# Smithers

> Deterministic, resumable AI workflow orchestration using JSX.
> Source: https://smithers.sh
> GitHub: https://github.com/evmts/smithers
> Package: smithers-orchestrator on npm

This file contains the complete Smithers documentation. Each section below corresponds to a documentation page on smithers.sh.

---

## Smithers

> Durable AI workflow orchestration for multi-step agent work that needs structure, visibility, and resumability.
> Source: https://smithers.sh/index

Smithers is what you reach for when one agent call turns into a real process.

The moment the job becomes "analyze, fix, validate, ask for approval, then resume tomorrow," you need a workflow runtime, not a longer prompt. Smithers gives you that runtime. You author the graph in JSX, run and steer it from the CLI, and rely on durable state in SQLite so completed work survives crashes and restarts.

## Start Here

- [Quickstart](/quickstart) for the fastest path to a successful run.
- [CLI Quickstart](/cli/quickstart) if you need to operate Smithers from the terminal.
- [JSX API](/jsx/overview) if you want to build your own workflows.
- [Workflows Overview](/concepts/workflows-overview) if you want the mental model first.

## Choose the Path That Matches Your Job

### I need something running today

- [Installation](/installation) gets the workflow pack into your project.
- [Quickstart](/quickstart) walks through your first run.
- [CLI Quickstart](/cli/quickstart) shows how to inspect, resume, and unblock runs.

### I need to build workflows

- [JSX API](/jsx/overview) explains the authoring model.
- [JSX Quickstart](/jsx/quickstart) builds a small workflow from scratch.
- [Tutorial: Build a Workflow](/guides/tutorial-workflow) expands that into a production-style example.
- [Components](/components/workflow) is the reference set for the JSX surface.

### I need production operations

- [CLI Reference](/cli/overview) is the full command reference.
- [Observability](/guides/monitoring-logs) covers logs, spans, and metrics.
- [Resumability](/guides/resumability) explains what survives failure and how to recover.
- [Time Travel Quickstart](/guides/time-travel-quickstart) covers replay, diff, and fork workflows.

## How Smithers Is Organized

| Surface | What it is for | Start here |
| --- | --- | --- |
| CLI | Run, inspect, resume, approve, debug, and operate workflows | [CLI Quickstart](/cli/quickstart) |
| JSX API | Define workflows as TSX components with typed outputs | [JSX API](/jsx/overview) |
| Runtime API | Execute or render workflows programmatically | [runWorkflow](/runtime/run-workflow) |
| Concepts | Learn the execution model and data flow | [Workflows Overview](/concepts/workflows-overview) |
| Guides | Learn common production patterns | [Best Practices](/guides/best-practices) |

## Why Teams Adopt It

- Completed task outputs are persisted to SQLite, so resumed runs skip finished work.
- Zod schemas make task output contracts explicit instead of "hope the model returns JSON."
- JSX keeps the workflow readable to humans and writable by code assistants.
- Approval gates, signals, time travel, and observability are part of the runtime instead of custom glue.

## A Good First Reading Order

1. [Quickstart](/quickstart)
2. [CLI Quickstart](/cli/quickstart)
3. [JSX API](/jsx/overview)
4. [Workflows Overview](/concepts/workflows-overview)
5. [Review Loop Guide](/guides/review-loop)

---

## Introduction

> Understand what Smithers is, how the CLI and JSX API fit together, and when the workflow runtime is worth using.
> Source: https://smithers.sh/introduction

Think of Smithers as a runtime for [durable agent work](/concepts/execution-model).

If React turns state into UI, Smithers turns state into executable work. Each render answers one question: given what has already completed, what can run now?

That matters because real agent jobs do not stay single-step for long. A useful task becomes:

- analyze a codebase
- propose a fix
- apply the fix
- run validation
- ask for [approval](/concepts/approvals)
- [resume later](/concepts/suspend-and-resume) if the process crashes

You can script all of that by hand. Most teams do at first. Then the retries, [persistence](/concepts/workflow-state), audit trail, and [branching logic](/concepts/control-flow) start taking over the project. Smithers exists to make that coordination the default instead of the afterthought.

## The Three Surfaces

### CLI

The CLI is the operations surface. You use it to scaffold workflow packs, launch runs, inspect state, read logs, answer [approvals](/concepts/approvals), send [signals](/components/signal), and recover from failure.

Start with [CLI Quickstart](/cli/quickstart) if that is your immediate job.

### JSX API

The JSX API is the authoring surface. You describe the workflow as a tree of [`<Workflow>`](/components/workflow), [`<Task>`](/components/task), and [control-flow components](/concepts/control-flow), and Smithers repeatedly renders that tree as outputs become available.

Start with [JSX API](/jsx/overview) if you are building workflows.

### Runtime

The runtime is the durable engine beneath both surfaces. It validates [structured output](/guides/structured-output), persists it to [SQLite](https://sqlite.org), emits [events](/runtime/events), and resumes safely after interruption.

Start with [Workflows Overview](/concepts/workflows-overview) if you want the model before the mechanics.

## What Happens During a Run

1. Smithers renders your workflow tree with the current `ctx`.
2. It finds the tasks that are ready to execute.
3. It runs those tasks and validates their outputs.
4. It persists the outputs and runtime metadata to SQLite.
5. It renders again with the updated state.

That loop continues until the workflow finishes, fails, pauses for [human input](/concepts/human-in-the-loop), or is cancelled.
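
The loop above can be sketched as a toy model. This is not the Smithers implementation: `runToCompletion`, `TaskDescriptor`, and the in-memory `Map` standing in for SQLite are all assumptions made for illustration, and validation, failure, and pausing are left out.

```ts
// Toy model of the render -> run -> persist loop. Not the Smithers implementation.
type Outputs = Map<string, unknown>;

interface TaskDescriptor {
  id: string;
  run: () => unknown; // hypothetical: real tasks call agents or compute functions
}

// A "workflow" here is just a function from persisted outputs to mounted tasks.
type Render = (outputs: Outputs) => TaskDescriptor[];

function runToCompletion(render: Render, outputs: Outputs = new Map()): string[] {
  const executed: string[] = [];
  for (;;) {
    // Steps 1-2: render with the current state, then keep tasks with no stored output.
    const ready = render(outputs).filter((task) => !outputs.has(task.id));
    if (ready.length === 0) return executed; // nothing left to run: finished
    for (const task of ready) {
      // Steps 3-4: run the task and persist its output (an in-memory Map here).
      outputs.set(task.id, task.run());
      executed.push(task.id);
    }
    // Step 5: loop back and render again with the updated state.
  }
}

const render: Render = (outputs) => {
  const tasks: TaskDescriptor[] = [{ id: "analyze", run: () => ({ summary: "ok" }) }];
  // The "fix" task only mounts once "analyze" has produced an output.
  if (outputs.has("analyze")) tasks.push({ id: "fix", run: () => ({ patch: "" }) });
  return tasks;
};

const order = runToCompletion(render);
console.log(order); // [ 'analyze', 'fix' ]
```

The point of the sketch is that the render function never schedules anything itself. It only describes what is mounted for a given state, and the loop decides what still needs to run.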

## When Smithers Is the Right Tool

Use Smithers when:

- order matters across multiple AI or compute steps
- you need [resumability](/guides/resumability) or crash recovery
- humans must [approve](/concepts/approvals) or answer questions mid-run
- different tasks need different [models](/guides/model-selection), [tools](/integrations/tools), or policies
- you want the workflow itself to stay readable and testable

If you only need one prompt and one response, a workflow is probably overkill.

## Read Next

- [Installation](/installation) to get the workflow pack into your project.
- [Quickstart](/quickstart) for the fastest first run.
- [CLI Quickstart](/cli/quickstart) to learn the operational flow.
- [JSX Quickstart](/jsx/quickstart) to build a workflow from scratch.
- [Workflows Overview](/concepts/workflows-overview) for the mental model.
- [Execution Model](/concepts/execution-model) to understand how renders, state, and resumability fit together.

---

## Installation

> Install smithers-orchestrator with the workflow pack, or manually for standalone JSX workflow projects.
> Source: https://smithers.sh/installation

Most teams should start with the workflow pack. It gives you a working `.smithers/` directory, seeded workflows, prompts, and agent configuration instead of asking you to assemble the project structure by hand.

All commands on this page use [`bunx`](https://bun.sh) `smithers-orchestrator ...`. The published npm package is [`smithers-orchestrator`](https://www.npmjs.com/package/smithers-orchestrator), so do not use bare `smithers` or `bunx smithers` for these install commands.

## Recommended: Install the Workflow Pack

```bash
bunx smithers-orchestrator init
```

That scaffolds `.smithers/` with files such as:

| Directory / File | Contents |
|---|---|
| `.smithers/workflows/` | Pre-built workflows (`implement`, `review`, `plan`, `ralph`, `debug`, ...) |
| `.smithers/prompts/` | Shared MDX prompt templates |
| `.smithers/components/` | Reusable TSX components (`Review`, `ValidationLoop`, ...) |
| `.smithers/package.json` | Local workflow project manifest with `smithers-orchestrator` dependency |
| `.smithers/tsconfig.json` | TypeScript config for JSX workflow authoring |
| `.smithers/bunfig.toml` | Bun preload config for MDX workflow prompts |
| `.smithers/preload.ts` | Registers the MDX preload plugin |
| `.smithers/agents.ts` | Auto-detected agent configuration |
| `.smithers/smithers.config.ts` | Repo-level config (lint, test, coverage commands) |
| `.smithers/tickets/` | Ticket workspace used by ticket-oriented workflows |
| `.smithers/executions/` | Execution artifacts directory preserved across re-inits |
| `.smithers/.gitignore` | Ignore rules for generated workflow state |

To overwrite an existing scaffold:

```bash
bunx smithers-orchestrator init --force
```

## When to Use Manual Installation

Use manual installation when you are embedding Smithers into an existing TypeScript codebase and want to author a standalone [workflow project](/guides/project-structure) from scratch.

See [JSX Installation](/jsx/installation) for the package list, TypeScript configuration, and optional MDX prompt setup.

## Requirements

- [Bun](https://bun.sh) >= 1.3
- TypeScript >= 5
- Model or provider credentials (e.g. [Anthropic](https://docs.anthropic.com) `ANTHROPIC_API_KEY`)

## After Installation

Choose the next page based on what you need:

- [Quickstart](/quickstart) to run a seeded workflow immediately.
- [CLI Quickstart](/cli/quickstart) to learn the operational workflow.
- [JSX Installation](/jsx/installation) to set up manual TSX authoring.
- [Project Structure](/guides/project-structure) to understand how a standalone workflow project fits together.
- [Tools Integration](/integrations/tools) to understand the built-in tool sandbox.

---

## Quickstart

> Get Smithers running end to end: scaffold the workflow pack, launch a workflow, inspect the run, and choose your next path.
> Source: https://smithers.sh/quickstart

By the end of this page you will have done the four things that matter for a real first run:

1. installed the [workflow pack](/installation)
2. launched a workflow
3. inspected the resulting run
4. chosen whether to go deeper on [CLI operations](/cli/quickstart) or [JSX authoring](/jsx/quickstart)

## 1. Scaffold the Workflow Pack

```bash
bunx smithers-orchestrator init
```

That creates a local `.smithers/` directory with seeded workflows, shared prompts, reusable components, and agent configuration.

## 2. Launch a Seeded Workflow

```bash
bunx smithers-orchestrator workflow run implement --prompt "Add rate limiting to the API"
```

This runs the seeded `implement` workflow that `init` installs into `.smithers/workflows/`.
It is the fastest way to see Smithers behave like a [workflow runtime](/concepts/workflows-overview) instead of a library on a shelf.

## 3. Inspect the Run

Once the workflow starts, use the CLI to see what happened:

```bash
bunx smithers-orchestrator ps
bunx smithers-orchestrator inspect <run-id>
bunx smithers-orchestrator logs <run-id> --tail 20 --follow false
```

Use `ps` to find the run ID if you do not already have it.

## 4. Understand What You Just Used

Every workflow gets:

- [SQLite](https://sqlite.org) persistence
- [Resume after restarts](/concepts/suspend-and-resume)
- Retries, [approvals](/concepts/approvals), [loops](/components/loop)
- Observability and [CLI tooling](/cli/overview)

You just used two parts of Smithers together:

- the [workflow pack](/installation) gave you runnable workflows
- the [CLI](/cli/overview) launched, inspected, and reported on the run

## Next Steps

- [CLI Quickstart](/cli/quickstart) to learn the terminal workflow in a more systematic way.
- [CLI Overview](/cli/overview) for the full command surface, including logs, approvals, and signals.
- [JSX Quickstart](/jsx/quickstart) to build a workflow from scratch.
- [Workflows Overview](/concepts/workflows-overview) to connect the run you just started to the runtime model.

---

## JSX API

> Build Smithers workflows as JSX trees with render-time branching, reusable components, and MDX prompts.
> Source: https://smithers.sh/jsx/overview

The [JSX](https://react.dev/learn/writing-markup-with-jsx) API is Smithers' component-based authoring layer. You describe the workflow as a tree of [`<Workflow>`](/components/workflow), [`<Task>`](/components/task), and [control-flow](/concepts/control-flow) components, and Smithers renders that tree into an execution plan.

You can mix normal [`<Task>`](/components/task) nodes and [MDX prompt](/guides/mdx-prompts) components in the same JSX tree.

Use JSX when you want:

- component composition and reusable workflow fragments
- explicit control-flow nodes like [`<Approval>`](/components/approval), [`<Parallel>`](/components/parallel), `<MergeQueue>`, and [`<Worktree>`](/components/worktree)
- [render-time branching](/concepts/reactivity) driven by `ctx.outputMaybe(...)`
- [MDX prompt templates](/guides/mdx-prompts)
- a TSX-first workflow authoring model

## What JSX Looks Like

```tsx
/** @jsxImportSource smithers-orchestrator */
import { createSmithers, Sequence, Task } from "smithers-orchestrator";
import { z } from "zod";

const analysisSchema = z.object({
  summary: z.string(),
});

const { Workflow, smithers, outputs } = createSmithers({
  analysis: analysisSchema,
});

export default smithers((ctx) => (
  <Workflow name="analyze-repo">
    <Sequence>
      <Task id="analyze" output={outputs.analysis}>
        {{ summary: `Analyze ${ctx.input.repo}` }}
      </Task>
    </Sequence>
  </Workflow>
));
```

The `outputs` object returned by `createSmithers` maps each schema key to its [Zod](https://zod.dev) schema. Passing `output={outputs.analysis}` instead of a magic string gives you compile-time type checking — a typo like `output={outputs.anaylsis}` is a type error, not a runtime surprise.
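
To see why a typed reference catches typos at compile time, here is a minimal sketch using a mapped type. `defineOutputs` and `OutputRef` are hypothetical names for this illustration and say nothing about how `createSmithers` is implemented internally.

```ts
// Toy registry: each key of the input object becomes a typed reference.
// `defineOutputs` is a hypothetical helper, not a Smithers export.
type OutputRef<K extends string> = { readonly key: K };

function defineOutputs<T extends Record<string, unknown>>(
  schemas: T,
): { [K in keyof T & string]: OutputRef<K> } {
  const refs: Record<string, OutputRef<string>> = {};
  for (const key of Object.keys(schemas)) {
    refs[key] = { key };
  }
  // The cast narrows the loose record to one property per schema key.
  return refs as unknown as { [K in keyof T & string]: OutputRef<K> };
}

const outputs = defineOutputs({ analysis: {}, fix: {} });

console.log(outputs.analysis.key); // analysis
// outputs.anaylsis would not compile: the property does not exist on the mapped type.
```

Because the return type has exactly one property per registered key, a misspelled property is rejected by the compiler before the workflow ever runs.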

## How JSX Execution Works

The JSX API is [render-driven](/concepts/execution-model):

1. Smithers renders the workflow tree with the current `ctx`.
2. It extracts executable task descriptors from the rendered tree.
3. It runs the ready tasks and persists their outputs.
4. It renders again with the updated outputs in `ctx`.

That means branching and task visibility are usually expressed with normal JSX conditions:

```tsx
const analysis = ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" });

return (
  <Workflow name="code-review">
    <Task id="analyze" output={outputs.analysis}>...</Task>
    {analysis ? (
      <Task id="fix" output={outputs.fix}>...</Task>
    ) : null}
  </Workflow>
);
```

## Two Common JSX Styles

### Schema-driven JSX

Use `createSmithers(...)` with [Zod](https://zod.dev) schemas and the returned `outputs` object. This is the fastest JSX path and the best default for new workflows. The `output` prop is type-checked against the registered schemas.

### Manual JSX

Use `smithers(db, build)` with explicit Drizzle table objects when you want lower-level control over persistence.

Both styles use the same JSX renderer and execution engine.

## Why JSX Works Well

- component composition keeps large workflows modular
- normal JSX conditions make branching and gating easy to read
- [TypeScript](https://www.typescriptlang.org) and [Zod](https://zod.dev) keep workflow data explicit and type-checked
- MDX prompts fit naturally into the same authoring model

## Next Steps

- [JSX Installation](/jsx/installation) — Set up Bun, TypeScript, and optional MDX prompts.
- [JSX Quickstart](/jsx/quickstart) — Build a two-step workflow.
- [Execution Model](/concepts/execution-model) — Understand the render and run loop behind JSX workflows.
- [Workflow State](/concepts/workflow-state) — Learn how `ctx.outputMaybe(...)` reads persisted task outputs.
- [Control Flow](/concepts/control-flow) — Choose between branching, approvals, loops, and parallel paths.
- [Workflow](/components/workflow) — Start with the root component reference.

---

## JSX Installation

> Install Smithers for the JSX workflow API.
> Source: https://smithers.sh/jsx/installation

The JSX API uses [TSX](https://react.dev/learn/writing-markup-with-jsx) to define workflows and the [AI SDK](https://ai-sdk.dev) to call models. The fastest setup is the schema-driven API with `createSmithers(...)`.

## Prerequisites

- [Bun](https://bun.sh) >= 1.3
- [TypeScript](https://www.typescriptlang.org) >= 5

## Install

Install the core JSX dependencies:

```bash
bun add smithers-orchestrator ai @ai-sdk/anthropic zod
```

Add TypeScript and the extra type packages that the current `smithers-orchestrator` exports require for `tsc --noEmit` to pass:

```bash
bun add -d typescript @types/bun @types/ws @types/diff
```

You do not need to add `@types/react` or `@types/react-dom` separately just to use the Smithers JSX runtime.

## TypeScript Configuration

Create a `tsconfig.json` like this:

```json
{
  "compilerOptions": {
    "target": "ESNext",
    "module": "ESNext",
    "lib": ["ESNext", "DOM", "DOM.Iterable"],
    "moduleResolution": "bundler",
    "jsx": "react-jsx",
    "jsxImportSource": "smithers-orchestrator",
    "strict": true,
    "noEmit": true,
    "skipLibCheck": true
  }
}
```

`jsxImportSource` is the key setting. TypeScript resolves it through the exported `smithers-orchestrator/jsx-runtime` and `smithers-orchestrator/jsx-dev-runtime` entry points.
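
With `"jsx": "react-jsx"`, TypeScript rewrites each JSX element into a call to a `jsx` function imported from `<jsxImportSource>/jsx-runtime`. The sketch below writes out roughly what that compiled form looks like by hand. The local `jsx` function and the `ToyElement` shape are stand-ins invented for this example, and the real compiled output passes component identifiers rather than strings.

```ts
// Stand-in for the `jsx` function that a JSX runtime module exports.
// The element shape below is an assumption for illustration only.
interface ToyElement {
  type: string;
  props: Record<string, unknown>;
}

function jsx(type: string, props: Record<string, unknown>): ToyElement {
  return { type, props };
}

// Roughly what `<Workflow name="demo"><Task id="a" /></Workflow>` becomes after
// the automatic JSX transform: nested function calls, children passed as a prop.
const tree = jsx("Workflow", {
  name: "demo",
  children: jsx("Task", { id: "a" }),
});

console.log(JSON.stringify(tree));
```

This is why the setting matters: without the correct import source, the compiler would look for the `jsx` function in a different package and the workflow tree would not be built by the Smithers runtime.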

## Project Shape

A minimal JSX workflow project usually looks like this. For a larger production layout, see [Project Structure](/guides/project-structure):

```txt
my-workflow/
  package.json
  tsconfig.json
  workflow.tsx
  main.ts
```

## Optional: MDX Prompt Files

If you want [MDX prompt](/guides/mdx-prompts) templates in `.mdx` files, register the MDX preload plugin:

```ts
// preload.ts
import { mdxPlugin } from "smithers-orchestrator";

mdxPlugin();
```

```toml
# bunfig.toml
preload = ["./preload.ts"]
```

And add the MDX types:

```bash
bun add -d @types/mdx
```


## Verify the Setup

Once TypeScript and JSX are configured, run the typecheck to validate your setup:

```bash
bunx tsc --noEmit
```

Then run the sample workflow:

```bash
bun run main.ts
```

If you want to verify the [CLI](/cli/overview) entrypoint too:

```bash
bunx smithers-orchestrator --help
```

## Next Steps

- [JSX Overview](/jsx/overview) — See how JSX workflows render, branch, and compose.
- [JSX Quickstart](/jsx/quickstart) — Build a working two-step workflow.
- [Project Structure](/guides/project-structure) — Organize TSX files, schemas, and prompts for larger projects.
- [MDX Prompts](/guides/mdx-prompts) — Use `.mdx` files as structured prompt templates.
- [CLI Quickstart](/cli/quickstart) — Run workflows from the CLI as well as programmatically.
- [Package Configuration](/reference/package-configuration) — Review exports, scripts, and build settings.

---

## JSX Quickstart

> Build and run your first Smithers workflow with the JSX API.
> Source: https://smithers.sh/jsx/quickstart

This guide builds a two-step [workflow](/concepts/workflows-overview) that researches a topic and then writes a report once the research output exists.

It uses the [AI SDK](https://ai-sdk.dev) with [Anthropic](https://docs.anthropic.com) and validates task outputs with [Zod](https://zod.dev).

## Step 1: Create the Workflow

Create `workflow.tsx`:

```tsx
/** @jsxImportSource smithers-orchestrator */
import { createSmithers, Sequence, Task } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  research: z.object({
    summary: z.string(),
    keyPoints: z.array(z.string()),
  }),
  output: z.object({
    title: z.string(),
    body: z.string(),
  }),
});

const researcher = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are an expert research assistant.",
});

const writer = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a concise technical writer.",
});

export default smithers((ctx) => {
  const research = ctx.outputMaybe(outputs.research, { nodeId: "research" });

  return (
    <Workflow name="research-report">
      <Sequence>
        <Task id="research" output={outputs.research} agent={researcher}>
          {`Research the following topic and return a summary with key points.\n\nTopic: ${ctx.input.topic}`}
        </Task>

        {research ? (
          <Task id="report" output={outputs.output} agent={writer}>
            {`Write a concise report.\n\nSummary: ${research.summary}\nKey points: ${research.keyPoints.join(", ")}`}
          </Task>
        ) : null}
      </Sequence>
    </Workflow>
  );
});
```

Two JSX-specific details matter here:

- `ctx.outputMaybe(outputs.research, { nodeId: "research" })` is how the second render discovers that `research` has finished; see [Workflow State](/concepts/workflow-state) for the persisted lookup model.
- the `report` [`<Task>`](/components/task) only mounts once the `research` output exists.
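
A rough way to picture these two details is a store lookup that returns `undefined` until a row exists. The sketch below is a toy model under that assumption: `ToyStore`, `readMaybe`, and `mountedTasks` are hypothetical names, and the real lookup goes through SQLite rather than a `Map`.

```ts
// Toy persisted-output store keyed by output name and node ID.
// `ToyStore` and its methods are hypothetical, not Smithers APIs.
class ToyStore {
  private rows = new Map<string, unknown>();

  write(output: string, nodeId: string, value: unknown): void {
    this.rows.set(`${output}:${nodeId}`, value);
  }

  // Returns undefined instead of throwing when the row does not exist yet.
  readMaybe<T>(output: string, nodeId: string): T | undefined {
    return this.rows.get(`${output}:${nodeId}`) as T | undefined;
  }
}

type Research = { summary: string; keyPoints: string[] };

// Mirrors the conditional in the workflow: "report" mounts only after "research" exists.
function mountedTasks(store: ToyStore): string[] {
  const research = store.readMaybe<Research>("research", "research");
  return research ? ["research", "report"] : ["research"];
}

const store = new ToyStore();
const firstRender = mountedTasks(store);
store.write("research", "research", { summary: "placeholder", keyPoints: [] });
const secondRender = mountedTasks(store);

console.log(firstRender, secondRender); // [ 'research' ] [ 'research', 'report' ]
```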

## Step 2: Run It

Create `main.ts`:

```ts
import { runWorkflow } from "smithers-orchestrator";
import workflow from "./workflow";

const result = await runWorkflow(workflow, {
  input: { topic: "The history of the Zig programming language" },
});

console.log(result.status);
if (result.status === "finished") {
  console.log(JSON.stringify(result.output, null, 2));
}
```

Run it:

```bash
bun run main.ts
```

Or run the workflow directly with the [CLI](/cli/quickstart):

```bash
bunx smithers-orchestrator up workflow.tsx --input '{"topic":"The history of the Zig programming language"}'
```

## What Happened

1. Smithers rendered the JSX tree. Only `research` was mounted; see [Render Frame](/runtime/render-frame) for the render step in detail.
2. The `research` task ran, validated its output against the Zod schema, and persisted it.
3. Smithers rendered again with that stored output available through `ctx.outputMaybe(...)`; see [Reactivity](/concepts/reactivity) for how new tasks mount on later renders.
4. The `report` task mounted on the second render and used the `research` output in its prompt.

## Next Steps

- [JSX Overview](/jsx/overview) — See how JSX rendering, branching, and composition work.
- [Workflow](/components/workflow) — Learn the root workflow component.
- [Task](/components/task) — See agent, compute, and static task modes.
- [runWorkflow](/runtime/run-workflow) — Learn the programmatic runtime entry point used in `main.ts`.
- [Tutorial: Build a Workflow](/guides/tutorial-workflow) — Build a larger production-style JSX workflow.

---

## Tutorial: Build a Workflow

> Step-by-step guide to building a Smithers workflow with schemas, agents, sequential tasks, and output access.
> Source: https://smithers.sh/guides/tutorial-workflow

## 1. Project Setup

```bash
mkdir code-review && cd code-review
bun init -y
bun add smithers-orchestrator ai @ai-sdk/anthropic zod
```

```json
// tsconfig.json
{
  "compilerOptions": {
    "target": "ESNext",
    "module": "ESNext",
    "moduleResolution": "bundler",
    "jsx": "react-jsx",
    "jsxImportSource": "smithers-orchestrator",
    "strict": true,
    "noEmit": true,
    "skipLibCheck": true
  }
}
```

```bash
export ANTHROPIC_API_KEY="sk-ant-..."
```

```txt
code-review/
  tsconfig.json
  package.json
  workflow.tsx      # Workflow definition
  main.ts          # Runner (optional -- CLI works too)
```

## 2. Define Schemas

Each [Zod](https://zod.dev) schema passed to `createSmithers` becomes a named, auto-created SQLite output table.

```tsx
/** @jsxImportSource smithers-orchestrator */
// workflow.tsx
import { createSmithers, Task, Sequence } from "smithers-orchestrator";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  analysis: z.object({
    summary: z.string(),
    issues: z.array(z.object({
      file: z.string(),
      line: z.number(),
      severity: z.enum(["low", "medium", "high"]),
      description: z.string(),
    })),
  }),
  fix: z.object({
    patch: z.string(),
    explanation: z.string(),
    filesChanged: z.array(z.string()),
  }),
  report: z.object({
    title: z.string(),
    body: z.string(),
    issueCount: z.number(),
    fixedCount: z.number(),
  }),
});
```

`outputs` provides typed references (`outputs.analysis` instead of the string `"analysis"`). Typos become compile errors. `runId`, `nodeId`, and `iteration` columns are auto-added.

## 3. Configure Agents

This example uses the [Vercel AI SDK](https://ai-sdk.dev) with [Anthropic Claude](https://docs.anthropic.com) models.

```tsx
/** @jsxImportSource smithers-orchestrator */
// workflow.tsx (continued)
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

const analyst = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a senior code reviewer. Analyze code for bugs, security issues, and quality problems. Return structured JSON.",
});

const fixer = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a senior engineer who writes minimal, correct fixes. Return structured JSON with a unified diff patch.",
});

const reporter = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a technical writer. Summarize code review findings into a clear report. Return structured JSON.",
});
```

## 4. Build the Workflow

```tsx
/** @jsxImportSource smithers-orchestrator */
// workflow.tsx (continued)
export default smithers((ctx) => {
  const analysis = ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" });
  const fix = ctx.outputMaybe(outputs.fix, { nodeId: "fix" });

  return (
    <Workflow name="code-review">
      <Sequence>
        <Task id="analyze" output={outputs.analysis} agent={analyst}>
          {`Review this code for bugs and issues:

Repository: ${ctx.input.repo}
Focus area: ${ctx.input.focusArea ?? "general"}

Return JSON with:
- summary (string): overall assessment
- issues (array): each with file, line, severity, and description`}
        </Task>

        {analysis ? (
          <Task id="fix" output={outputs.fix} agent={fixer}>
            {`Fix these issues:

${analysis.issues.map((i) => `- [${i.severity}] ${i.file}:${i.line} - ${i.description}`).join("\n")}

Return JSON with:
- patch (string): unified diff
- explanation (string): what you changed and why
- filesChanged (string[]): list of modified files`}
          </Task>
        ) : null}

        {analysis && fix ? (
          <Task id="report" output={outputs.report} agent={reporter}>
            {`Write a code review report.

Analysis summary: ${analysis.summary}
Issues found: ${analysis.issues.length}
Fix explanation: ${fix.explanation}
Files changed: ${fix.filesChanged.join(", ")}

Return JSON with:
- title (string)
- body (string): markdown report
- issueCount (number)
- fixedCount (number)`}
          </Task>
        ) : null}
      </Sequence>
    </Workflow>
  );
});
```

- `ctx.outputMaybe()` returns `undefined` until the task completes, which makes it safe for [reactive control flow](/concepts/reactivity).
- `{analysis ? <Task .../> : null}` gates downstream [tasks](/components/task) on upstream completion inside a [Sequence](/components/sequence).
- `ctx.input` is the runtime input object (here: `{ repo: string, focusArea?: string }`). See the [data model](/concepts/data-model).
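
It can help to see the string that the `fix` task prompt actually interpolates. The sketch below lifts the same template expression into a standalone function. `formatIssues` and the sample issues are made up for this example.

```ts
interface Issue {
  file: string;
  line: number;
  severity: "low" | "medium" | "high";
  description: string;
}

// Same template expression as the `fix` task prompt, extracted into a function.
function formatIssues(issues: Issue[]): string {
  return issues
    .map((i) => `- [${i.severity}] ${i.file}:${i.line} - ${i.description}`)
    .join("\n");
}

// Hypothetical sample data, shaped like the `analysis.issues` schema.
const sample: Issue[] = [
  { file: "src/auth.ts", line: 42, severity: "high", description: "Token is never verified" },
  { file: "src/db.ts", line: 7, severity: "low", description: "Unused import" },
];

console.log(formatIssues(sample));
// - [high] src/auth.ts:42 - Token is never verified
// - [low] src/db.ts:7 - Unused import
```

Extracting prompt formatting into plain functions like this also makes it easy to unit test without calling a model.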

## 5. Create the Runner

```ts
// main.ts
import { runWorkflow } from "smithers-orchestrator";
import workflow from "./workflow";

const result = await runWorkflow(workflow, {
  input: { repo: "/path/to/my-project", focusArea: "authentication" },
  onProgress: (event) => {
    if (event.type === "NodeStarted") {
      console.log(`Starting: ${event.nodeId}`);
    }
    if (event.type === "NodeFinished") {
      console.log(`Finished: ${event.nodeId}`);
    }
  },
});

console.log("Status:", result.status);
console.log("Run ID:", result.runId);

if (result.status === "finished") {
  console.log("Run finished. Inspect the persisted report row in Step 7.");
}
```

Because the final schema key here is `report` rather than `output`, `result.output` stays `undefined`. Rename that schema key to `output` if you want [`runWorkflow()`](/runtime/run-workflow) to return it directly.
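
The `onProgress` callback in the runner can also feed a small status tracker instead of logging directly. The sketch below assumes only the two fields the runner already reads, `type` and `nodeId`. The `ProgressEvent` interface and `createTracker` helper are hypothetical, and the real event type may carry more fields.

```ts
// Assumed event shape: only `type` and `nodeId` are taken from the runner above.
interface ProgressEvent {
  type: string;
  nodeId?: string;
}

// Hypothetical helper that turns the event stream into a small status summary.
function createTracker() {
  const running = new Set<string>();
  const finished: string[] = [];
  return {
    onProgress(event: ProgressEvent): void {
      if (event.nodeId === undefined) return; // ignore events without a node
      if (event.type === "NodeStarted") running.add(event.nodeId);
      if (event.type === "NodeFinished") {
        running.delete(event.nodeId);
        finished.push(event.nodeId);
      }
    },
    snapshot() {
      return { running: Array.from(running), finished: finished.slice() };
    },
  };
}

// Simulated events, in the order a three-step run might emit them.
const tracker = createTracker();
tracker.onProgress({ type: "NodeStarted", nodeId: "analyze" });
tracker.onProgress({ type: "NodeFinished", nodeId: "analyze" });
tracker.onProgress({ type: "NodeStarted", nodeId: "fix" });

console.log(tracker.snapshot()); // { running: [ 'fix' ], finished: [ 'analyze' ] }
```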

## 6. Run It

```bash
bun run main.ts
```

Or via the [CLI](/cli/quickstart) (no `main.ts` needed):

```bash
bunx smithers-orchestrator up workflow.tsx --input '{"repo": "/path/to/my-project", "focusArea": "authentication"}'
```

## 7. Inspect Results

```bash
RUN_ID="your-run-id"
bunx smithers-orchestrator inspect "$RUN_ID"
bunx smithers-orchestrator graph workflow.tsx --run-id "$RUN_ID"
bunx smithers-orchestrator logs "$RUN_ID" --tail 5 --follow false
```

Query SQLite directly:

```bash
sqlite3 smithers.db "SELECT * FROM analysis WHERE run_id = '$RUN_ID';"
sqlite3 smithers.db "SELECT * FROM report WHERE run_id = '$RUN_ID';"
```

## Execution Model

The engine renders the JSX tree repeatedly, following the [execution model](/concepts/execution-model):

1. **Render 1** -- Only `analyze` is mounted. Engine executes it.
2. **Render 2** -- `ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" })` returns data. `fix` mounts and executes.
3. **Render 3** -- Both outputs available. `report` mounts and executes.
4. **Render 4** -- All tasks finished. Run completes.

On crash, [resume](/guides/resumability) skips completed tasks:

```bash
bunx smithers-orchestrator up workflow.tsx --run-id "$RUN_ID" --resume true
```
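
Why resume can skip finished work is easiest to see in a toy model: if each output is persisted before the next step starts, a second attempt only has to run the steps with no stored output. The `attempt` function and the in-memory `Map` below are assumptions for illustration, not how the engine is written.

```ts
// Toy resume: outputs persisted before a crash are reused on the next attempt.
type Store = Map<string, string>;

const steps = ["analyze", "fix", "report"];

function attempt(store: Store, crashAt?: string): string[] {
  const ran: string[] = [];
  for (const step of steps) {
    if (store.has(step)) continue; // completed in an earlier attempt: skip
    if (step === crashAt) throw new Error(`crashed during ${step}`);
    store.set(step, `${step}-output`); // persist before moving on
    ran.push(step);
  }
  return ran;
}

const store: Store = new Map();
try {
  attempt(store, "fix"); // first attempt persists "analyze", then crashes
} catch {
  // A real runtime would record the failure; the toy just falls through.
}

const resumed = attempt(store); // second attempt starts from the persisted state
console.log(resumed); // [ 'fix', 'report' ]
```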

## Adding Tools

```tsx
/** @jsxImportSource smithers-orchestrator */
import { read, grep, bash } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

const analyst = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a senior code reviewer.",
  tools: { read, grep, bash },
});
```

Tools are sandboxed to the workflow root by default. See [Built-in Tools](/integrations/tools).
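
As a rough idea of what "sandboxed to the workflow root" can mean for file paths, the sketch below resolves a requested path against a root and rejects anything that escapes it. It is a simplified, POSIX-only illustration: `resolveInside` is a hypothetical helper, it assumes an absolute root, it ignores symlinks, and it is not the check Smithers performs.

```ts
// Resolve `candidate` against `root` and collapse "." and ".." segments.
// POSIX-style paths only; a hypothetical helper, not the Smithers sandbox.
function resolveInside(root: string, candidate: string): string | null {
  const base = root.split("/").filter(Boolean);
  // Absolute candidates start from "/", relative ones start from the root.
  const parts = candidate.startsWith("/") ? [] : base.slice();
  for (const segment of candidate.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  }
  // Compare whole segments so "/repository" is not mistaken for "/repo".
  const inside = base.every((segment, i) => parts[i] === segment);
  return inside ? "/" + parts.join("/") : null;
}

console.log(resolveInside("/repo", "src/index.ts")); // /repo/src/index.ts
console.log(resolveInside("/repo", "../etc/passwd")); // null
```

Comparing segment by segment, rather than using a plain string prefix test, avoids treating a sibling directory with a similar name as part of the root.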

## Next Steps

- [Run Workflow](/runtime/run-workflow) -- Runner API options and result handling.
- [CLI Quickstart](/cli/quickstart) -- Run, inspect, and resume workflows from the CLI.
- [Structured Output](/guides/structured-output) -- Schema validation in depth.
- [Error Handling](/guides/error-handling) -- Retries, timeouts, fallback paths.
- [Resumability](/guides/resumability) -- Crash recovery and deterministic replay.
- [Patterns](/guides/patterns) -- Project structure for larger workflows.

---

## Production Project Structure

> Recommended file structure for production Smithers workflows with MDX prompts, component-per-step, and Zod schemas.
> Source: https://smithers.sh/guides/project-structure

For 1-5 tasks, a [single-file pattern](/guides/patterns#single-file-pattern) suffices. For anything larger, split the project as follows.

## Directory Layout

```txt
scripts/my-workflow/
  workflow.tsx          # Root workflow -- thin, just composition
  smithers.ts           # createSmithers() + schema registry
  agents.ts             # Agent definitions (CLI + API SDK)
  config.ts             # Shared constants (max iterations, etc.)
  system-prompt.ts      # Build system prompt from MDX + docs
  preload.ts            # MDX plugin registration
  bunfig.toml           # preload = ["./preload.ts"]
  package.json
  tsconfig.json
  run.sh                # Shell script to launch workflow
  components/
    index.ts            # Re-export all components
    Discover.tsx        # Step component
    Discover.schema.ts  # Zod schema for output
    Discover.mdx        # Prompt template
    Implement.tsx
    Implement.schema.ts
    Implement.mdx
    Validate.tsx
    Validate.schema.ts
    Validate.mdx
    Review.tsx
    Review.schema.ts
    Review.mdx
    ReviewFix.tsx
    ReviewFix.schema.ts
    ReviewFix.mdx
    Report.tsx
    Report.schema.ts
    Report.mdx
    TicketPipeline.tsx   # Composed pipeline per ticket
    ValidationLoop.tsx   # Loop: implement -> validate -> review -> fix
  prompts/
    system-prompt.mdx    # Master system prompt template
    *.md                 # Domain-specific context docs
```

## Rationale

| Principle | Effect |
|---|---|
| MDX prompts | Prompt engineering separated from orchestration logic |
| Schema files | Per-step `.schema.ts` with Zod; auto-creates SQLite tables, validates output |
| One component per step | Composable, independently testable |
| Thin `workflow.tsx` | Root file only composes components |
| Shared `agents.ts` | Single place to configure and swap models |

## Key Files

### `smithers.ts` -- Schema Registry

```ts
import { createSmithers } from "smithers-orchestrator";
import { DiscoverOutput } from "./components/Discover.schema";
import { ImplementOutput } from "./components/Implement.schema";
import { ValidateOutput } from "./components/Validate.schema";
import { ReviewOutput } from "./components/Review.schema";
import { ReviewFixOutput } from "./components/ReviewFix.schema";
import { ReportOutput } from "./components/Report.schema";

export const { Workflow, Task, useCtx, smithers, tables, outputs } = createSmithers({
  discover: DiscoverOutput,
  implement: ImplementOutput,
  validate: ValidateOutput,
  review: ReviewOutput,
  reviewFix: ReviewFixOutput,
  report: ReportOutput,
}, { dbPath: "./my-workflow.db" });
```

### `agents.ts`

See [Model Selection](/guides/model-selection) for the dual-agent setup pattern.

### `config.ts`

```ts
export const MAX_REVIEW_ROUNDS = 3;
export const IMPLEMENT_TIMEOUT_MS = 45 * 60 * 1000;
export const REVIEW_TIMEOUT_MS = 15 * 60 * 1000;
```

### `preload.ts` + `bunfig.toml`

```ts
// preload.ts
import { mdxPlugin } from "smithers-orchestrator";
mdxPlugin();
```

```toml
# bunfig.toml
preload = ["./preload.ts"]

[test]
preload = ["./preload.ts"]
```

### `workflow.tsx`

```tsx
import { Sequence, Branch } from "smithers-orchestrator";
import { Discover, TicketPipeline } from "./components";
import { Ticket } from "./components/Discover.schema";
import { Workflow, smithers, tables } from "./smithers";

export default smithers((ctx) => {
  const discoverOutput = ctx.latest(tables.discover, "discover-codex");
  const unfinishedTickets = ctx
    .latestArray(discoverOutput?.tickets, Ticket)
    .filter((t) => !ctx.latest(tables.report, `${t.id}:report`)) as Ticket[];

  return (
    <Workflow name="my-workflow">
      <Sequence>
        <Branch if={unfinishedTickets.length === 0} then={<Discover />} />
        {unfinishedTickets.map((ticket) => (
          <TicketPipeline key={ticket.id} ticket={ticket} />
        ))}
      </Sequence>
    </Workflow>
  );
});
```

### `components/index.ts`

```ts
export { Discover } from "./Discover";
export { Implement } from "./Implement";
export { Validate } from "./Validate";
export { Review } from "./Review";
export { ReviewFix } from "./ReviewFix";
export { Report } from "./Report";
export { ValidationLoop } from "./ValidationLoop";
export { TicketPipeline } from "./TicketPipeline";
export type { Ticket } from "./Discover.schema";
```

### `run.sh`

```bash
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
ROOT_DIR="$(cd "$SCRIPT_DIR/../.." && pwd)"

cd "$SCRIPT_DIR"

export USE_CLI_AGENTS=1
export SMITHERS_UNSAFE=1

echo "Starting workflow"
bunx smithers-orchestrator up workflow.tsx --input '{}' --root "$ROOT_DIR"
```

### `package.json`

```json
{
  "name": "my-workflow",
  "type": "module",
  "scripts": {
    "start": "bun run workflow.tsx",
    "resume": "smithers up workflow.tsx --run-id <run-id> --resume true",
    "typecheck": "tsc --noEmit"
  },
  "dependencies": {
    "@ai-sdk/anthropic": "^3.0.36",
    "@ai-sdk/openai": "^2.0.0",
    "@mdx-js/esbuild": "^3.1.1",
    "@mdx-js/mdx": "^3.1.1",
    "@types/mdx": "^2.0.13",
    "ai": "^6.0.69",
    "react-dom": "^19.2.4",
    "smithers-orchestrator": "latest",
    "zod": "^4.3.6"
  },
  "devDependencies": {
    "@types/node": "^25.2.2",
    "@types/react": "^19.2.13",
    "@types/react-dom": "^19.2.3",
    "typescript": "^5.9.3"
  }
}
```

### `tsconfig.json`

```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "ESNext",
    "lib": ["ES2022"],
    "jsx": "react-jsx",
    "jsxImportSource": "smithers-orchestrator",
    "moduleResolution": "bundler",
    "allowImportingTsExtensions": true,
    "resolveJsonModule": true,
    "noEmit": true,
    "strict": true,
    "skipLibCheck": true,
    "types": ["@types/mdx", "@types/react-dom", "@types/node"]
  },
  "include": ["**/*.ts", "**/*.tsx", "**/*.mdx"],
  "exclude": ["node_modules"]
}
```

## The Component Pattern

Each step follows a three-file pattern: schema, prompt, component.

### 1. Schema (`Component.schema.ts`)

```ts
import { z } from "zod";

export const ImplementOutput = z.object({
  filesCreated: z.array(z.string()).nullable(),
  filesModified: z.array(z.string()).nullable(),
  commitMessages: z.array(z.string()),
  whatWasDone: z.string(),
  allTestsPassing: z.boolean(),
  testOutput: z.string(),
});
export type ImplementOutput = z.infer<typeof ImplementOutput>;
```

### 2. Prompt (`Component.mdx`)

```mdx
IMPLEMENTATION -- Ticket: {props.ticketId} -- {props.ticketTitle}

{props.ticketDescription}

ACCEPTANCE CRITERIA:
- {props.acceptanceCriteria}

{props.previousImplementation
  ? `PREVIOUS ATTEMPT:\n${props.previousImplementation.whatWasDone}\nFix issues from previous attempt.`
  : ""}

{props.reviewFixes
  ? `REVIEW FIXES NEEDED:\n${props.reviewFixes}`
  : ""}

**REQUIRED OUTPUT** -- JSON matching this schema:
{props.schema}
```

`{props.schema}` is auto-injected by Smithers from the Zod schema.

### 3. Component (`Component.tsx`)

```tsx
import { Task, useCtx, tables, outputs } from "../smithers";
import { codex } from "../agents";
import { IMPLEMENT_TIMEOUT_MS } from "../config";
import ImplementPrompt from "./Implement.mdx";
import type { Ticket } from "./Discover.schema";

export function Implement({ ticket }: { ticket: Ticket }) {
  const ctx = useCtx();
  const ticketId = ticket.id;
  const latestValidate = ctx.latest(tables.validate, `${ticketId}:validate`);

  return (
    <Task id={`${ticketId}:implement`} output={outputs.implement} agent={codex} timeoutMs={IMPLEMENT_TIMEOUT_MS}>
      <ImplementPrompt
        ticketId={ticketId}
        ticketTitle={ticket.title}
        ticketDescription={ticket.description}
        acceptanceCriteria={ticket.acceptanceCriteria?.join("\n- ") ?? ""}
        validationFeedback={latestValidate ?? null}
      />
    </Task>
  );
}
```

## Next Steps

- [MDX Prompts](/guides/mdx-prompts) -- MDX for system prompts and per-step prompts.
- [Implement-Review Loop](/guides/review-loop) -- The ValidationLoop component pattern.
- [Dynamic Tickets](/guides/dynamic-tickets) -- Agent-driven ticket discovery.
- [Patterns](/guides/patterns) -- Naming conventions, output access patterns.

---

## CLI Quickstart

> Learn the core Smithers terminal workflow: launch a run, inspect it, read logs, answer approvals, and resume safely.
> Source: https://smithers.sh/cli/quickstart

This page is about using Smithers, not authoring it.

If JSX defines the workflow graph, the CLI is how you actually operate that graph in day-to-day work.

## Before You Start

You need:

- a project with `.smithers/` installed via `bunx smithers-orchestrator init`
- provider or agent credentials configured for the workflow you want to run
- a discovered workflow under `.smithers/workflows` or an explicit `.tsx` workflow file

## 1. Start a Run

Run a discovered workflow from the local workflow pack:

```bash
bunx smithers-orchestrator workflow run implement --prompt "Add pagination to the activity feed"
```

`workflow run <name>` resolves `.smithers/workflows/<name>.tsx`. The `--prompt` flag maps the string to `input.prompt`.

Or run an explicit workflow file:

```bash
bunx smithers-orchestrator up workflow.tsx --input '{"task":"Add pagination to the activity feed"}'
```
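
When a run is launched from a script instead of an interactive shell, building the argument list in code avoids hand-quoting the JSON passed to `--input`. The sketch below is illustrative only: `buildUpArgs` and `promptToInput` are hypothetical helpers, not part of `smithers-orchestrator`, and they use only the `up`, `--input`, `--run-id`, and `--resume` options shown on this page.

```typescript
// Hypothetical helpers for illustration; not exported by smithers-orchestrator.
interface UpOptions {
  input?: Record<string, unknown>; // serialized and passed as --input
  runId?: string; // passed as --run-id
  resume?: boolean; // passed as --resume true
}

// Builds the argument list that would follow `bunx` on the command line.
function buildUpArgs(workflowFile: string, options: UpOptions = {}): string[] {
  const args = ["smithers-orchestrator", "up", workflowFile];
  if (options.input !== undefined) {
    args.push("--input", JSON.stringify(options.input));
  }
  if (options.runId !== undefined) {
    args.push("--run-id", options.runId);
  }
  if (options.resume) {
    args.push("--resume", "true");
  }
  return args;
}

// `--prompt "..."` on `workflow run` corresponds to an input object
// with a single `prompt` key.
function promptToInput(prompt: string): { prompt: string } {
  return { prompt };
}
```

The resulting array can be handed to a process spawner as the arguments to `bunx`, so the JSON never passes through shell quoting.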

## 2. Find the Run You Care About

```bash
bunx smithers-orchestrator ps
```

This shows recent and active runs. Copy the run ID from the output; the commands in the following steps use it to target that run.

## 3. Inspect the Run

```bash
bunx smithers-orchestrator inspect <run-id>
```

Use `inspect` when you need the structured view: status, current steps, approvals, outputs, retries, and loop state.

## 4. Read the Live Trail

Use logs for lifecycle events. `--tail` limits output to the most recent events and `--follow` keeps streaming new ones:

```bash
bunx smithers-orchestrator logs <run-id> --tail 50 --follow
```

Use chat for agent prompts, model replies, and stderr. `--tail` limits chat blocks and `--follow` watches for new output:

```bash
bunx smithers-orchestrator chat <run-id> --tail 20 --follow
```

Use `node` when you need one node's details instead of the whole run. Pass the node ID directly and add `--run-id` when you want to scope it to a specific run:

```bash
bunx smithers-orchestrator node <node-id> --run-id <run-id>
```

## 5. Handle Pauses

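If a workflow pauses for approval, record a decision. The decision alone does not restart execution, so resume the run afterwards (step 6):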

```bash
bunx smithers-orchestrator approve <run-id>
# or
bunx smithers-orchestrator deny <run-id>
```

If a workflow is waiting on a signal:

```bash
bunx smithers-orchestrator signal <run-id> wait-for-input --data '{"choice":"ship-it"}'
```

If you are not sure why the run is blocked:

```bash
bunx smithers-orchestrator why <run-id>
```

## 6. Resume Safely

If the process exits or your machine restarts, resume the same run ID with the same entrypoint you started with:

```bash
bunx smithers-orchestrator workflow run implement --run-id <run-id> --resume
# or, if you launched a file directly:
bunx smithers-orchestrator up workflow.tsx --run-id <run-id> --resume
```

That is the durability story in one command. Completed tasks are not re-run.
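
The idea behind "completed tasks are not re-run" can be sketched as output memoization keyed by task ID. This is a simplified model, not Smithers' implementation: the `Map` stands in for the SQLite store, and `runTask` and `runPipeline` are hypothetical names.

```typescript
// In-memory stand-in for the durable store (Smithers persists to SQLite).
type OutputStore = Map<string, unknown>;

// Runs `work` only if no output has been recorded for this task ID.
function runTask<T>(store: OutputStore, taskId: string, work: () => T): T {
  if (store.has(taskId)) {
    return store.get(taskId) as T; // completed earlier: reuse the saved output
  }
  const output = work();
  store.set(taskId, output); // record completion before moving on
  return output;
}

// A three-step pipeline that can be told to crash at one step.
function runPipeline(store: OutputStore, executed: string[], failAt?: string): void {
  for (const taskId of ["analyze", "fix", "validate"]) {
    runTask(store, taskId, () => {
      if (taskId === failAt) throw new Error(`simulated crash in ${taskId}`);
      executed.push(taskId);
      return `${taskId}-output`;
    });
  }
}
```

Calling `runPipeline` once with `failAt` set to `"fix"` and then again without it, against the same store, executes `analyze` only once; the second call picks up at `fix`.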

## 7. Know When to Switch to JSX

Stay in the CLI when your job is:

- launching workflows
- inspecting state
- debugging and recovery
- approvals, signals, and operations

Switch to JSX when your job is:

- defining tasks and schemas
- changing workflow structure
- adding branching, loops, approvals, or subflows
- building reusable workflow components

## Next Steps

- [CLI Reference](/cli/overview) for the full command surface.
- [JSX API](/jsx/overview) if you want to build or modify workflows.
- [Debugging](/guides/debugging) for failure analysis.
- [Resumability](/guides/resumability) for the crash-recovery model.

---

## CLI Reference

> Complete reference for the current smithers-orchestrator command-line interface: scaffold workflow packs, run workflows, inspect runs, manage approvals, hijack sessions, and operate local tooling.
> Source: https://smithers.sh/cli/overview

```bash
bunx smithers-orchestrator <command> [options]
```

The CLI is the operational surface for Smithers. JSX defines the workflow. The CLI is how you launch it, inspect it, answer approvals, recover from interruption, and understand why it is blocked.

## Start With These Commands

| Goal | Command |
|---|---|
| Install the local workflow pack | `bunx smithers-orchestrator init` |
| Launch a workflow file | `bunx smithers-orchestrator up workflow.tsx --input '{"task":"..."}'` |
| Launch a discovered pack workflow | `bunx smithers-orchestrator workflow implement --prompt "..."` |
| List recent and active runs | `bunx smithers-orchestrator ps` |
| Inspect one run deeply | `bunx smithers-orchestrator inspect <runId>` |
| Read event logs | `bunx smithers-orchestrator logs <runId>` |
| Read agent prompts and replies | `bunx smithers-orchestrator chat <runId>` |
| Approve or deny a pause | `bunx smithers-orchestrator approve <runId>` / `bunx smithers-orchestrator deny <runId>` |
| Resume after interruption | `bunx smithers-orchestrator up workflow.tsx --run-id <runId> --resume true` |

## How the CLI Fits With JSX

- JSX is the authoring model.
- The CLI is the operations model.
- Both talk to the same runtime and the same persisted run state.

If you are new to Smithers, read [CLI Quickstart](/cli/quickstart) before using this page as a full reference.

## Command Map

| Command | Purpose |
|---|---|
| `init` | Install a local `.smithers/` workflow pack. |
| `up` | Start or resume a workflow execution. |
| `tui` | Open the interactive Smithers observability dashboard. |
| `ps` | List active, paused, and recently completed runs. |
| `logs` | Stream lifecycle events for a run. |
| `chat` | Show agent prompts, responses, and stderr. |
| `inspect` | Print structured run state, steps, approvals, and loop info. |
| `node` | Show enriched node details for debugging retries, tool calls, and output. |
| `events` | Query run event history with filters, grouping, and NDJSON output. |
| `why` | Explain why a run is currently blocked or paused. |
| `approve` / `deny` | Record a decision for a pending approval gate. |
| `signal` | Deliver external data to a waiting signal node. |
| `supervise` | Watch for stale runs and auto-resume them. |
| `cancel` / `down` | Stop one run or all active runs. |
| `hijack` | Hand off an agent session or conversation. |
| `alerts` | List, acknowledge, resolve, or silence durable alerts. |
| `graph` | Render a workflow tree without executing. |
| `revert` | Restore workspace to a previous task attempt snapshot. |
| `retry-task` | Retry a specific task within a run, then resume. |
| `timetravel` | Revert to a previous task state with filesystem + DB reset. |
| `replay` | Fork from a checkpoint and resume execution (time travel). |
| `diff` | Compare two time-travel snapshots. |
| `fork` | Create a branched run from a snapshot checkpoint. |
| `timeline` | View execution timeline for a run and its forks. |
| `scores` | View scorer results for a run. |
| `observability` | Start or stop the local Grafana/Prometheus/Tempo stack. |
| `ask` | Query an installed agent CLI via MCP. |
| `human` | List and resolve durable human requests from a `<HumanTask>` node. |
| `agents` | Inspect built-in CLI agent capability registries and validate metadata. |
| `memory` | View and query cross-run memory facts. |
| `rag` | Ingest documents and query the RAG knowledge base. |
| `openapi` | Preview AI SDK tools generated from an OpenAPI spec. |
| `workflow` | Discover, create, resolve, and run flat workflows in `.smithers/workflows`. |
| `cron` | Register and run background schedule triggers. |
| `completions` | Generate shell completion scripts. |
| `mcp add` | Register Smithers as an MCP server for an agent integration. |
| `skills` | Sync skill files to agent integrations. |

## Resolution Rules

### Workflow paths and IDs

`up`, `graph`, and `revert` take explicit workflow file paths:

```bash
bunx smithers-orchestrator up workflow.tsx --input '{"topic":"Zig"}'
bunx smithers-orchestrator graph workflow.tsx
bunx smithers-orchestrator revert workflow.tsx --run-id run-123 --node-id analyze
```

A workflow file can be invoked directly without `up`:

```bash
bunx smithers-orchestrator workflow.tsx --input '{"topic":"Zig"}'
```

The `workflow` command family resolves IDs under `.smithers/workflows/*.tsx`:

```bash
bunx smithers-orchestrator workflow implement --prompt "Add input validation"
bunx smithers-orchestrator workflow run implement --prompt "Add input validation"
```

`bunx smithers-orchestrator workflow run <name>` is a real subcommand. `bunx smithers-orchestrator workflow <name>` is a shorthand that rewrites to it.

### Finding persisted state

Commands that operate on existing runs locate the nearest `smithers.db` by walking upward from the working directory:

`ps`, `logs`, `events`, `chat`, `inspect`, `node`, `why`, `scores`, `approve`, `deny`, `signal`, `cancel`, `down`, `supervise`, `diff`, `timeline`, `workflow doctor`, `cron start`, `cron list`, `cron rm`

If no database is found, the command exits with an error.
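
The upward search can be pictured as a loop that checks each directory and then moves to its parent. The following is a minimal sketch under that assumption, not the CLI's actual code; `findNearestDb` is a hypothetical name, and the `fileExists` parameter exists only so the lookup can be tried without touching the disk.

```typescript
import { existsSync } from "node:fs";
import { dirname, join, resolve } from "node:path";

// Walks from `startDir` toward the filesystem root and returns the first
// `smithers.db` path found, or null when none exists on the way up.
function findNearestDb(
  startDir: string,
  fileExists: (path: string) => boolean = existsSync,
): string | null {
  let dir = resolve(startDir);
  while (true) {
    const candidate = join(dir, "smithers.db");
    if (fileExists(candidate)) return candidate;
    const parent = dirname(dir);
    if (parent === dir) return null; // reached the root without a match
    dir = parent;
  }
}
```

A `null` result corresponds to the error case described above.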

### Boolean options

Boolean flags accept either bare form or explicit `true` / `false`. Use `false` to disable a default-on option:

```bash
bunx smithers-orchestrator logs run-123 --follow false
bunx smithers-orchestrator chat run-123 --stderr false
bunx smithers-orchestrator hijack run-123 --launch false
bunx smithers-orchestrator up workflow.tsx --log false
bunx smithers-orchestrator observability --down true
```
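
As a rough model of the rule above, a flag is `true` when it appears bare, takes the explicit value when followed by `true` or `false`, and otherwise falls back to its default. The function below is an illustrative sketch, not the CLI's real parser; `readBooleanFlag` is a hypothetical name.

```typescript
// Resolves a boolean flag from an argument list:
// - flag absent            -> defaultValue
// - flag followed by true  -> true
// - flag followed by false -> false
// - flag followed by anything else (or nothing) -> bare form, true
function readBooleanFlag(argv: string[], flag: string, defaultValue: boolean): boolean {
  const index = argv.indexOf(flag);
  if (index === -1) return defaultValue;
  const next = argv[index + 1];
  if (next === "true") return true;
  if (next === "false") return false;
  return true;
}
```

For example, `["logs", "run-123", "--follow", "false"]` resolves `--follow` to `false` even though its default is `true`, while `["up", "workflow.tsx", "--detach"]` resolves `--detach` to `true`.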

## Commands

### init

Install the local workflow pack into `.smithers/`.

```bash
bunx smithers-orchestrator init [options]
```

| Option | Description |
|---|---|
| `--force <boolean>` | Overwrite existing scaffold files. Default: `false`. The help output shows this as a bare boolean flag. |

Generated pack contents:

- `.smithers/workflows/` -- seeded workflows (`implement`, `review`, `plan`, `ticket`, `tickets`, `ralph`, `improve-test-coverage`, `test-first`, `debug`)
- `.smithers/prompts/` -- shared MDX prompt templates
- `.smithers/components/` -- reusable workflow components
- `.smithers/agents.ts` -- agent configuration based on installed CLIs
- `.smithers/preload.ts`, `.smithers/bunfig.toml`, `package.json`, `tsconfig.json`

```bash
bunx smithers-orchestrator init
bunx smithers-orchestrator workflow implement --prompt "Commit the new .smithers pack"
```

### up

Start or resume a workflow execution.

```bash
bunx smithers-orchestrator up <workflow> [options]
```

| Option | Description |
|---|---|
| `--detach`, `-d <boolean>` | Background mode; print `runId`/`pid`/`logFile` and exit. Default: `false`. |
| `--run-id`, `-r <string>` | Explicit run ID. |
| `--max-concurrency`, `-c <number>` | Maximum parallel tasks. Default: `4`. |
| `--root <string>` | Tool sandbox root. Default: workflow file's parent directory. |
| `--log <boolean>` | NDJSON event log output. Default: `true`. |
| `--log-dir <string>` | NDJSON log directory. |
| `--allow-network <boolean>` | Allow `bash` tool network access. Does not affect CLI-backed agents. |
| `--max-output-bytes <number>` | Max bytes per tool call return. |
| `--tool-timeout-ms <number>` | Max wall-clock time per tool call. |
| `--hot <boolean>` | Hot reload for `.tsx` workflows. Default: `false`. |
| `--input`, `-i <string>` | Input JSON string. |
| `--resume <boolean>` | Resume existing run. Default: `false`. |
| `--force <boolean>` | Resume even if run is marked `running`. Default: `false`. |
| `--resume-claim-owner <string>` | Internal durable resume claim owner. |
| `--resume-claim-heartbeat <number>` | Internal durable resume claim heartbeat. |
| `--resume-restore-owner <string>` | Internal durable resume restore owner. |
| `--resume-restore-heartbeat <number>` | Internal durable resume restore heartbeat. |
| `--serve <boolean>` | Start an HTTP server alongside the workflow. Default: `false`. |
| `--port <number>` | HTTP server port when `--serve true`. Default: `7331`. |
| `--host <string>` | HTTP bind address when `--serve true`. Default: `127.0.0.1`. |
| `--auth-token <string>` | Bearer token for HTTP auth. Can also be provided through `SMITHERS_API_KEY`. |
| `--metrics <boolean>` | Expose `/metrics` Prometheus endpoint when `--serve true`. Default: `true`. |
| `--supervise <boolean>` | Run the stale-run supervisor loop (with `--serve`). Default: `false`. |
| `--supervise-dry-run <boolean>` | With `--supervise`, detect stale runs without resuming. Default: `false`. |
| `--supervise-interval <string>` | With `--supervise`, poll interval (e.g. `10s`, `30s`). Default: `"10s"`. |
| `--supervise-stale-threshold <string>` | With `--supervise`, stale heartbeat threshold. Default: `"30s"`. |
| `--supervise-max-concurrent <number>` | With `--supervise`, max runs resumed per poll. Default: `3`. |

```bash
bunx smithers-orchestrator up workflow.tsx --input '{"description":"Fix auth bug"}'
bunx smithers-orchestrator up workflow.tsx --run-id run-123 --resume true
bunx smithers-orchestrator up workflow.tsx --run-id run-123 --resume true --force true
bunx smithers-orchestrator up workflow.tsx --detach --input '{"description":"Deploy v2"}'
bunx smithers-orchestrator up workflow.tsx --serve --port 8080 --auth-token secret
bunx smithers-orchestrator workflow.tsx --input '{"description":"Direct file shorthand"}'
```

- Detached mode redirects stdout/stderr to a log file regardless of NDJSON logging settings.
- Exits with code `3` when the workflow pauses in `waiting-approval`.
- Serve mode starts the HTTP app and keeps the process alive until interrupted.
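
A wrapper script can branch on the exit code of `up` to decide what to do next. In the sketch below, only exit code `3` comes from this reference; treating `0` as success is an assumption based on the usual process convention, and `classifyUpExit` and `suggestNextCommands` are hypothetical helpers. The suggested commands are all ones documented on this page.

```typescript
type UpOutcome = "waiting-approval" | "success" | "other";

// Exit code 3 is documented above; 0 as success is an assumption.
function classifyUpExit(exitCode: number): UpOutcome {
  if (exitCode === 3) return "waiting-approval";
  if (exitCode === 0) return "success";
  return "other";
}

// Returns follow-up commands an operator might run for each outcome.
function suggestNextCommands(exitCode: number, runId: string, workflowFile: string): string[] {
  const cli = "bunx smithers-orchestrator";
  switch (classifyUpExit(exitCode)) {
    case "waiting-approval":
      return [
        `${cli} why ${runId}`,
        `${cli} approve ${runId}`,
        `${cli} up ${workflowFile} --run-id ${runId} --resume true`,
      ];
    case "success":
      return [];
    default:
      return [`${cli} inspect ${runId}`, `${cli} logs ${runId} --follow false`];
  }
}
```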

### tui

Open the interactive Smithers observability dashboard for the nearest `smithers.db`.

```bash
bunx smithers-orchestrator tui
```

There are no command-specific flags in the current help output. See the [TUI guide](/guides/tui) for the current keyboard model and feature reference.

### ps

List active, paused, and recent runs.

```bash
bunx smithers-orchestrator ps [options]
```

| Option | Description |
|---|---|
| `--status`, `-s <string>` | Filter: `running`, `waiting-approval`, `waiting-event`, `waiting-timer`, `continued`, `finished`, `failed`, `cancelled`. |
| `--limit`, `-l <number>` | Max rows. Default: `20`. |
| `--all`, `-a <boolean>` | Include all statuses. |
| `--watch`, `-w <boolean>` | Watch mode: refresh output continuously. Default: `false`. |
| `--interval`, `-i <number>` | Watch refresh interval in seconds. Default: `2`. |

```bash
bunx smithers-orchestrator ps
bunx smithers-orchestrator ps --status waiting-approval
bunx smithers-orchestrator ps --limit 50
bunx smithers-orchestrator ps --watch --interval 5
```
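
Scripts that pass user input to `--status` may want to validate it first. The sketch below encodes the status values listed in the table above as a TypeScript union; `RUN_STATUSES` and `isRunStatus` are hypothetical names, not exports of `smithers-orchestrator`.

```typescript
// Status values accepted by `ps --status`, as listed in the options table.
const RUN_STATUSES = [
  "running",
  "waiting-approval",
  "waiting-event",
  "waiting-timer",
  "continued",
  "finished",
  "failed",
  "cancelled",
] as const;

type RunStatus = (typeof RUN_STATUSES)[number];

// Type guard: narrows an arbitrary string to a documented status.
function isRunStatus(value: string): value is RunStatus {
  return (RUN_STATUSES as readonly string[]).includes(value);
}
```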

### logs

Tail lifecycle events for a run.

```bash
bunx smithers-orchestrator logs <runId> [options]
```

| Option | Description |
|---|---|
| `--follow`, `-f <boolean>` | Poll for new events while run is active. Default: `true`. |
| `--follow-ancestry <boolean>` | Include events from parent runs in the output, ordered root-to-current. Default: `false`. |
| `--since <number>` | Start from event sequence number. |
| `--tail`, `-n <number>` | Last `N` events. Default: `50`. |

```bash
bunx smithers-orchestrator logs run-123
bunx smithers-orchestrator logs run-123 --follow false
bunx smithers-orchestrator logs run-123 --since 400
bunx smithers-orchestrator logs run-123 --tail 10
bunx smithers-orchestrator logs run-123 --follow-ancestry true
```

When `--follow-ancestry` is enabled, events from the full parent chain are merged into a single stream with each line prefixed by the originating run ID.

`logs` reads from the database, not from detached-process stdout. See [Events](/runtime/events).
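
The root-to-current ordering of `--follow-ancestry` can be sketched as a walk over parent links. Everything in this example is an assumption made for illustration: the `parentRunId` field, the `[runId]` prefix format, and the choice to emit each run's events as one contiguous group. The real command may interleave events differently.

```typescript
interface RunRecord {
  runId: string;
  parentRunId?: string; // hypothetical link from a run to its parent
  events: string[];
}

// Returns the chain of runs from the root ancestor down to `runId`.
function ancestryChain(runs: Map<string, RunRecord>, runId: string): RunRecord[] {
  const chain: RunRecord[] = [];
  const seen = new Set<string>(); // guards against accidental cycles
  let current = runs.get(runId);
  while (current !== undefined && !seen.has(current.runId)) {
    seen.add(current.runId);
    chain.unshift(current); // parents end up before children
    current = current.parentRunId === undefined ? undefined : runs.get(current.parentRunId);
  }
  return chain;
}

// Produces one stream in which every line is prefixed by its run ID.
function prefixedStream(runs: Map<string, RunRecord>, runId: string): string[] {
  const lines: string[] = [];
  for (const run of ancestryChain(runs, runId)) {
    for (const event of run.events) {
      lines.push(`[${run.runId}] ${event}`);
    }
  }
  return lines;
}
```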

### events

Query run event history with filters, grouping, and NDJSON output.

```bash
bunx smithers-orchestrator events <runId> [options]
```

| Option | Description |
|---|---|
| `--node`, `-n <string>` | Filter events by node ID. |
| `--type`, `-t <string>` | Filter by event category: `agent`, `approval`, `frame`, `memory`, `node`, `openapi`, `output`, `rag`, `revert`, `run`, `sandbox`, `scorer`, `snapshot`, `supervisor`, `timer`, `token`, `tool-call`, `voice`, `workflow`. |
| `--since`, `-s <string>` | Filter to a recent duration window (e.g. `5m`, `2h`). |
| `--limit`, `-l <number>` | Maximum events to display. Default: `1000`. Max: `100000`. |
| `--json`, `-j <boolean>` | Output NDJSON for piping. Default: `false`. |
| `--group-by <string>` | Group output by `"node"` or `"attempt"`. |
| `--watch`, `-w <boolean>` | Watch mode: append new events as they arrive. Default: `false`. |
| `--interval`, `-i <number>` | Watch poll interval in seconds. Default: `2`. |

```bash
bunx smithers-orchestrator events run-123
bunx smithers-orchestrator events run-123 --node analyze --type tool-call
bunx smithers-orchestrator events run-123 --since 5m --limit 500
bunx smithers-orchestrator events run-123 --json
bunx smithers-orchestrator events run-123 --watch --interval 5
bunx smithers-orchestrator events run-123 --group-by node
```
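
Output from `events --json` is NDJSON, so a consumer can parse it one line at a time. The sketch below shows the parsing step plus a simple per-node count. The field name `nodeId` is an assumption for illustration; check real output for the actual keys before relying on it.

```typescript
// Field names here are assumed for illustration, not taken from the CLI.
interface EventLine {
  nodeId?: string;
  type?: string;
  [key: string]: unknown;
}

// Parses NDJSON text: one JSON object per non-empty line.
function parseNdjson(text: string): EventLine[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as EventLine);
}

// Counts events per node; events without a node share one bucket.
function countByNode(events: EventLine[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const event of events) {
    const key = event.nodeId ?? "(no node)";
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```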

### chat

Show agent transcripts for the latest or a specified run. Reconstructs chat from attempt metadata, `NodeOutput` events, and fallback `responseText`.

```bash
bunx smithers-orchestrator chat [runId] [options]
```

| Option | Description |
|---|---|
| `--all`, `-a <boolean>` | Show every agent attempt. Default: `false`. |
| `--follow`, `-f <boolean>` | Watch for new output. Default: `false`. |
| `--tail`, `-n <number>` | Last `N` chat blocks. |
| `--stderr <boolean>` | Include agent stderr. Default: `true`. |

```bash
bunx smithers-orchestrator chat
bunx smithers-orchestrator chat run-123
bunx smithers-orchestrator chat run-123 --all true
bunx smithers-orchestrator chat run-123 --tail 20
bunx smithers-orchestrator chat run-123 --follow true --stderr false
```

### inspect

Print structured run state.

```bash
bunx smithers-orchestrator inspect <runId> [options]
```

| Option | Description |
|---|---|
| `--watch`, `-w <boolean>` | Watch mode: refresh output continuously. Default: `false`. |
| `--interval`, `-i <number>` | Watch refresh interval in seconds. Default: `2`. |

Output includes: run metadata, step states, pending approvals, loop state, parsed runtime config, and error details.

```bash
bunx smithers-orchestrator inspect run-123
bunx smithers-orchestrator inspect run-123 --watch --interval 5
```

### node

Show enriched node details for debugging retries, tool calls, and output.

```bash
bunx smithers-orchestrator node <nodeId> [options]
```

| Option | Description |
|---|---|
| `--run-id`, `-r <string>` | Run ID containing the node. |
| `--iteration`, `-i <number>` | Loop iteration number. Default: latest iteration. |
| `--attempts <boolean>` | Expand all attempts in human output. Default: `false`. |
| `--tools <boolean>` | Expand tool input/output payloads in human output. Default: `false`. |
| `--watch`, `-w <boolean>` | Watch mode: refresh output continuously. Default: `false`. |
| `--interval <number>` | Watch refresh interval in seconds. Default: `2`. |

```bash
bunx smithers-orchestrator node analyze -r run-123
bunx smithers-orchestrator node analyze -r run-123 --attempts
bunx smithers-orchestrator node analyze -r run-123 --tools
bunx smithers-orchestrator node analyze -r run-123 --watch --interval 5
bunx smithers-orchestrator node analyze -r run-123 --iteration 2
```

### why

Explain why a run is currently blocked or paused.

```bash
bunx smithers-orchestrator why <runId> [options]
```

| Option | Description |
|---|---|
| `--json <boolean>` | Output structured JSON diagnosis. Default: `false`. |

```bash
bunx smithers-orchestrator why run-123
bunx smithers-orchestrator why run-123 --json
```

### scores

View scorer results for a specific run.

```bash
bunx smithers-orchestrator scores <runId> [options]
```

| Option | Description |
|---|---|
| `--node <string>` | Filter scores to a specific node ID. |

```bash
bunx smithers-orchestrator scores run-123
bunx smithers-orchestrator scores run-123 --node review
```

### approve

Approve a pending approval gate.

```bash
bunx smithers-orchestrator approve <runId> [options]
```

| Option | Description |
|---|---|
| `--node`, `-n <string>` | Node ID. Required when multiple gates are pending. |
| `--iteration <number>` | Loop iteration. Default: `0`. |
| `--note <string>` | Approval note. |
| `--by <string>` | Approver identifier. |

```bash
bunx smithers-orchestrator approve run-123 --node deploy --note "Looks good" --by alice
```

Recording approval does not resume execution. Resume with:

```bash
bunx smithers-orchestrator up workflow.tsx --run-id run-123 --resume true
```
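
The `--node` rule from the options table can be modeled as a small resolution step: an explicit node must be pending, and omitting it is only unambiguous when exactly one gate is pending. This is a sketch of that rule, not the CLI's code; `resolveGate` is a hypothetical name and the error messages are invented for the example.

```typescript
// Picks the approval gate a decision applies to.
function resolveGate(pendingNodeIds: string[], node?: string): string {
  if (node !== undefined) {
    if (!pendingNodeIds.includes(node)) {
      throw new Error(`No pending approval for node "${node}"`);
    }
    return node;
  }
  const [only] = pendingNodeIds;
  if (pendingNodeIds.length === 1 && only !== undefined) {
    return only; // a single pending gate needs no --node
  }
  if (pendingNodeIds.length === 0) {
    throw new Error("No pending approvals");
  }
  throw new Error("Multiple approvals pending; pass --node");
}
```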

### deny

Deny a pending approval gate. Same options as `approve`.

```bash
bunx smithers-orchestrator deny <runId> [options]
```

```bash
bunx smithers-orchestrator deny run-123 --node deploy --note "Rollback plan missing" --by bob
```

### signal

Deliver a durable signal to a run waiting on `<WaitForEvent>`.

```bash
bunx smithers-orchestrator signal <runId> <signalName> [options]
```

| Option | Description |
|---|---|
| `--data <string>` | Signal payload as JSON. Default: `{}`. |
| `--correlation <string>` | Correlation ID to match a specific waiter. |
| `--by <string>` | Name or identifier of the signal sender. |

```bash
bunx smithers-orchestrator signal run-123 payment-received --data '{"amount":100}'
bunx smithers-orchestrator signal run-123 approval --correlation txn-456 --by alice
```

### supervise

Watch for stale running runs and auto-resume them.

```bash
bunx smithers-orchestrator supervise [options]
```

| Option | Description |
|---|---|
| `--dry-run`, `-n <boolean>` | Show which stale runs would be resumed, without acting. Default: `false`. |
| `--interval`, `-i <string>` | Poll interval (e.g. `10s`, `30s`, `1m`). Default: `"10s"`. |
| `--stale-threshold`, `-t <string>` | Heartbeat staleness threshold before resume. Default: `"30s"`. |
| `--max-concurrent`, `-c <number>` | Max runs resumed per poll. Default: `3`. |

```bash
bunx smithers-orchestrator supervise
bunx smithers-orchestrator supervise --dry-run
bunx smithers-orchestrator supervise --interval 30s --stale-threshold 1m --max-concurrent 5
```
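
Duration options such as `--interval 30s` and `--stale-threshold 1m` are short strings made of a number and a unit. A minimal sketch of how such strings could be turned into milliseconds and compared against a heartbeat follows. It covers only the `s`, `m`, and `h` units that appear in this reference; whether the CLI accepts other units is not stated here, and `parseDurationMs` and `isStale` are hypothetical names.

```typescript
const UNIT_MS: Record<string, number> = {
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
};

// Converts strings like "10s", "1m", or "2h" to milliseconds.
function parseDurationMs(text: string): number {
  const match = /^(\d+)(s|m|h)$/.exec(text.trim());
  if (match === null) throw new Error(`Unsupported duration: ${text}`);
  const [, amount = "", unit = ""] = match;
  const factor = UNIT_MS[unit];
  if (factor === undefined) throw new Error(`Unsupported unit: ${unit}`);
  return Number(amount) * factor;
}

// A heartbeat is stale once its age exceeds the threshold.
function isStale(lastHeartbeatMs: number, nowMs: number, threshold: string): boolean {
  return nowMs - lastHeartbeatMs > parseDurationMs(threshold);
}
```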

#### Supervisor behavior

Each poll cycle the supervisor:

1. Queries stale `running` runs whose heartbeat exceeds `--stale-threshold`.
2. Queries `waiting-timer` runs that have a timer past its fire time.
3. For each candidate, applies guards before resuming:
   - **Workflow existence**: skips the run if the workflow `.tsx` file no longer exists on disk.
   - **PID liveness**: skips the run if the runtime owner PID is still alive (the run is not actually stale).
   - **Claim acquisition**: attempts to claim the run for resume; skips if another supervisor already claimed it.
4. Stale runs are processed first, up to `--max-concurrent`. Timer-due runs fill remaining concurrency slots.
5. Runs that exceed the concurrency cap are rate-limited and retried on the next poll.
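
The poll cycle above can be sketched as a pure planning function. This is a simplified model under stated assumptions: each guard is represented as a precomputed boolean (the real supervisor would check the file, the PID, and the claim at resume time), guards are assumed to be applied before a run counts against the cap, and `planPoll` is a hypothetical name.

```typescript
interface ResumeCandidate {
  runId: string;
  workflowFileExists: boolean; // guard: workflow existence
  ownerPidAlive: boolean; // guard: PID liveness
  claimAcquired: boolean; // guard: claim acquisition
}

interface PollPlan {
  resume: string[]; // resumed in this poll
  deferred: string[]; // over the cap, retried next poll
  skipped: string[]; // failed a guard
}

function passesGuards(candidate: ResumeCandidate): boolean {
  return candidate.workflowFileExists && !candidate.ownerPidAlive && candidate.claimAcquired;
}

// Stale runs are considered first; timer-due runs fill remaining slots.
function planPoll(
  stale: ResumeCandidate[],
  timerDue: ResumeCandidate[],
  maxConcurrent: number,
): PollPlan {
  const plan: PollPlan = { resume: [], deferred: [], skipped: [] };
  for (const candidate of [...stale, ...timerDue]) {
    if (!passesGuards(candidate)) {
      plan.skipped.push(candidate.runId);
    } else if (plan.resume.length < maxConcurrent) {
      plan.resume.push(candidate.runId);
    } else {
      plan.deferred.push(candidate.runId);
    }
  }
  return plan;
}
```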

### cancel

Halt one active run. Marks in-progress attempts as cancelled.

```bash
bunx smithers-orchestrator cancel <runId>
```

```bash
bunx smithers-orchestrator cancel run-123
```

Exits with code `2` on success.

### down

Cancel all active runs in the nearest `smithers.db`.

```bash
bunx smithers-orchestrator down [options]
```

| Option | Description |
|---|---|
| `--force <boolean>` | Cancel stale runs too. Default: `false`. |

```bash
bunx smithers-orchestrator down
```

### hijack

Hand off the latest resumable agent session or conversation.

```bash
bunx smithers-orchestrator hijack <runId> [options]
```

| Option | Description |
|---|---|
| `--target <string>` | Expected engine: `claude-code` or `codex`. |
| `--timeout-ms <number>` | Wait time for live handoff. Default: `30000`. |
| `--launch <boolean>` | Open session immediately. Default: `true`. |

There are two hijack styles: native session hijack (for CLI agents with resumable session IDs) and conversation hijack (for agents that resume from durable message history).

```bash
bunx smithers-orchestrator hijack run-123
bunx smithers-orchestrator hijack run-123 --target codex
bunx smithers-orchestrator hijack run-123 --launch false
```

- Cross-engine hijack is not supported.
- If a live hijack succeeds and the run has a workflow path, Smithers auto-resumes in detached mode.
- If auto-resume fails, prints the equivalent `bunx smithers-orchestrator up ... --resume true --run-id ...` command.
- Conversation hijack reconstructs agents from the original `.tsx` source.

### graph

Render a workflow tree without executing. Uses [renderFrame](/runtime/render-frame) internally.

```bash
bunx smithers-orchestrator graph <workflow> [options]
```

| Option | Description |
|---|---|
| `--run-id`, `-r <string>` | Run ID for persisted input/outputs. Default: `"graph"`. |
| `--input <string>` | Input JSON. Overrides persisted input. |

```bash
bunx smithers-orchestrator graph workflow.tsx --input '{"description":"Preview"}'
bunx smithers-orchestrator graph workflow.tsx --run-id run-123
```

### revert

Restore workspace to a previous task attempt snapshot. See [Revert](/runtime/revert).

```bash
bunx smithers-orchestrator revert <workflow> [options]
```

| Option | Description |
|---|---|
| `--run-id`, `-r <string>` | Run ID. |
| `--node-id`, `-n <string>` | Node ID. |
| `--attempt <number>` | Attempt number. Default: `1`. |
| `--iteration <number>` | Loop iteration. Default: `0`. |

```bash
bunx smithers-orchestrator revert workflow.tsx --run-id run-123 --node-id analyze --attempt 2
```

### retry-task

Retry a specific task within a run. Resets the target node (and optionally its dependents) to `pending`, clears their output rows, then resumes the workflow. Only the reset tasks re-execute; completed tasks use cached results.

```bash
bunx smithers-orchestrator retry-task <workflow> [options]
```

| Option | Description |
|---|---|
| `--run-id`, `-r <string>` | Run ID containing the task. |
| `--node-id`, `-n <string>` | Task/node ID to retry. |
| `--iteration <number>` | Loop iteration. Default: `0`. |
| `--no-deps <boolean>` | Only reset this node, not dependents. Default: `false`. |
| `--force <boolean>` | Allow retry even if run is still marked `running`. Default: `false`. |

```bash
bunx smithers-orchestrator retry-task workflow.tsx -r run-123 -n implement
bunx smithers-orchestrator retry-task workflow.tsx -r run-123 -n analyze --no-deps true
bunx smithers-orchestrator retry-task workflow.tsx -r run-123 -n stuck-task --force true
```

- Resets the target node and all downstream dependents to `pending` by default.
- Clears persisted output rows for reset nodes so the agent starts fresh.
- Cancels any prior failed/waiting attempts so retry budgets don't block re-execution.
- After resetting, automatically resumes the workflow with `resume: true`.
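
The reset rule above (the target plus all downstream dependents, or the target alone with `--no-deps`) amounts to a reachability walk over the task graph. A minimal sketch, assuming a hypothetical adjacency map from each node ID to the nodes that directly depend on it; `DependentsGraph` and `nodesToReset` are illustrative names, not Smithers internals:

```ts
// Hypothetical graph shape: node ID -> IDs of nodes that directly depend on it.
type DependentsGraph = Record<string, string[] | undefined>;

// Return the target and, unless noDeps is set, every node reachable downstream of it.
function nodesToReset(graph: DependentsGraph, target: string, noDeps = false): string[] {
  if (noDeps) return [target];
  const seen = new Set<string>([target]);
  const queue: string[] = [target]; // doubles as the breadth-first visit order
  for (let i = 0; i < queue.length; i++) {
    const current = queue[i] as string;
    for (const next of graph[current] ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return queue;
}

const exampleGraph: DependentsGraph = {
  analyze: ["implement"],
  implement: ["validate", "report"],
  validate: ["report"],
  report: [],
};

console.log(nodesToReset(exampleGraph, "implement"));       // ["implement", "validate", "report"]
console.log(nodesToReset(exampleGraph, "implement", true)); // ["implement"]
```

Upstream nodes such as `analyze` are left out of the result, which matches the statement that completed tasks keep their cached results.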

### timetravel

Time-travel to a previous task state: revert the filesystem via jj to an attempt's snapshot, reset the DB state, and optionally resume. Combines `revert` + node reset into one atomic operation.

```bash
bunx smithers-orchestrator timetravel <workflow> [options]
```

| Option | Description |
|---|---|
| `--run-id`, `-r <string>` | Run ID. |
| `--node-id`, `-n <string>` | Task/node ID to travel back to. |
| `--iteration <number>` | Loop iteration. Default: `0`. |
| `--attempt`, `-a <number>` | Attempt number. Default: latest. |
| `--no-vcs <boolean>` | Skip filesystem revert (DB only). Default: `false`. |
| `--no-deps <boolean>` | Only reset this node, not dependents. Default: `false`. |
| `--resume <boolean>` | Resume the workflow after time travel. Default: `false`. |
| `--force <boolean>` | Force even if run is still marked `running`. Default: `false`. |

```bash
bunx smithers-orchestrator timetravel workflow.tsx -r run-123 -n implement --resume true
bunx smithers-orchestrator timetravel workflow.tsx -r run-123 -n analyze -a 1 --no-vcs true
bunx smithers-orchestrator timetravel workflow.tsx -r run-123 -n step-3 --no-deps true --resume true
```

- Restores the filesystem to the attempt's `jjPointer` snapshot (unless `--no-vcs`).
- Deletes DB frames created after the target attempt.
- Resets the target node and dependents to `pending`.
- With `--resume`, immediately re-runs the workflow from the reverted state.

### replay

Fork from a checkpoint and resume execution (time travel). Creates a new run branched from a specific frame of an existing run, optionally resetting a node and overriding input, then immediately resumes execution.

```bash
bunx smithers-orchestrator replay <workflow> [options]
```

| Option | Description |
|---|---|
| `--run-id`, `-r <string>` | Source run ID to replay from. |
| `--frame`, `-f <number>` | Frame number to fork from. |
| `--node`, `-n <string>` | Node ID to reset to pending. |
| `--input`, `-i <string>` | Input overrides as JSON string. |
| `--label`, `-l <string>` | Branch label for the fork. |
| `--restore-vcs <boolean>` | Restore jj filesystem state to the source frame's revision. Default: `false`. |

```bash
bunx smithers-orchestrator replay workflow.tsx --run-id run-123 --frame 5
bunx smithers-orchestrator replay workflow.tsx -r run-123 -f 5 -n analyze --input '{"topic":"Rust"}'
bunx smithers-orchestrator replay workflow.tsx -r run-123 -f 5 --label experiment-1 --restore-vcs true
```

### diff

Compare two time-travel snapshots.

```bash
bunx smithers-orchestrator diff <a> <b> [options]
```

Arguments are snapshot refs in the format `run_id:frame_no` or `run_id` (uses latest frame).

| Option | Description |
|---|---|
| `--json <boolean>` | Output as JSON. Default: `false`. |

```bash
bunx smithers-orchestrator diff run-123:5 run-123:10
bunx smithers-orchestrator diff run-123 run-456
bunx smithers-orchestrator diff run-123:5 run-456:3 --json
```
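
If you script around `diff`, it can help to parse snapshot refs before passing them along. A minimal sketch of the `run_id:frame_no` / `run_id` format described above; `SnapshotRef` and `parseSnapshotRef` are hypothetical helpers, and the sketch assumes the frame number is a non-negative integer after the last colon:

```ts
// Parsed snapshot ref. frameNo is undefined when the ref means "latest frame".
interface SnapshotRef {
  runId: string;
  frameNo?: number;
}

function parseSnapshotRef(ref: string): SnapshotRef {
  if (ref === "") throw new Error("Empty snapshot ref");
  const idx = ref.lastIndexOf(":");
  if (idx === -1) return { runId: ref }; // bare run ID: latest frame
  const runId = ref.slice(0, idx);
  const frameText = ref.slice(idx + 1);
  if (runId === "" || !/^\d+$/.test(frameText)) {
    throw new Error(`Invalid snapshot ref: ${ref}`);
  }
  return { runId, frameNo: Number(frameText) };
}

console.log(parseSnapshotRef("run-123:5")); // { runId: "run-123", frameNo: 5 }
console.log(parseSnapshotRef("run-456"));   // { runId: "run-456" }
```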

### fork

Create a branched run from a snapshot checkpoint (time travel). Unlike `replay`, `fork` does not automatically resume execution unless `--run true` is passed.

```bash
bunx smithers-orchestrator fork <workflow> [options]
```

| Option | Description |
|---|---|
| `--run-id`, `-r <string>` | Source run ID. |
| `--frame`, `-f <number>` | Frame number to fork from. |
| `--reset-node`, `-n <string>` | Node ID to reset to pending. |
| `--input`, `-i <string>` | Input overrides as JSON string. |
| `--label`, `-l <string>` | Branch label. |
| `--run <boolean>` | Immediately start the forked run. Default: `false`. |

```bash
bunx smithers-orchestrator fork workflow.tsx --run-id run-123 --frame 5
bunx smithers-orchestrator fork workflow.tsx -r run-123 -f 5 -n analyze --label fix-branch
bunx smithers-orchestrator fork workflow.tsx -r run-123 -f 5 --run true
```

### timeline

View execution timeline for a run and its forks (time travel).

```bash
bunx smithers-orchestrator timeline <runId> [options]
```

| Option | Description |
|---|---|
| `--tree <boolean>` | Include all child forks recursively. Default: `false`. |
| `--json <boolean>` | Output as JSON. Default: `false`. |

```bash
bunx smithers-orchestrator timeline run-123
bunx smithers-orchestrator timeline run-123 --tree
bunx smithers-orchestrator timeline run-123 --json
```

### observability

Start or stop the local observability stack (Docker Compose).

```bash
bunx smithers-orchestrator observability [options]
```

| Option | Description |
|---|---|
| `--detach`, `-d <boolean>` | Background containers. Default: `false`. |
| `--down <boolean>` | Stop and remove stack. Default: `false`. |

```bash
bunx smithers-orchestrator observability
bunx smithers-orchestrator observability --detach
bunx smithers-orchestrator observability --down
```

Prints Grafana, Prometheus, and Tempo endpoints on success. See [Monitoring & Logs](/guides/monitoring-logs).

### ask

Ask a question through the best available installed agent CLI.

```bash
bunx smithers-orchestrator ask "How do I resume an approval-gated run?"
```

Supported agents: Claude, Codex, Gemini, Kimi, and PI. The command bootstraps a temporary MCP config, exposes Smithers as a local MCP server (`bunx smithers-orchestrator --mcp`), and delegates the question. Exits with error if no supported agent is installed.

| Option | Description |
|---|---|
| `--agent <string>` | Force a specific agent: `claude`, `codex`, `gemini`, `kimi`, or `pi`. Default: auto-select best available. |
| `--tool-surface <string>` | MCP tool surface to expose: `semantic` or `raw`. Default: `semantic`. |
| `--no-mcp <boolean>` | Disable MCP bootstrap; pass the question as a plain prompt. Default: `false`. |
| `--print-bootstrap <boolean>` | Print the bootstrap configuration (agent, mode, tool surface) instead of running. Default: `false`. |
| `--dump-prompt <boolean>` | Print the full system prompt that would be sent to the agent. Default: `false`. |
| `--list-agents <boolean>` | List all available agents with usability status and bootstrap mode. Default: `false`. |

MCP bootstrap mode is selected automatically per agent: `mcp-config-file` for Claude Code and Kimi, `mcp-config-inline` for Codex, `mcp-allow-list` for Gemini, and `prompt-only` for PI. Use `--no-mcp` to disable MCP entirely.

```bash
bunx smithers-orchestrator ask "How do I resume an approval-gated run?"
bunx smithers-orchestrator ask --agent codex "Explain the workflow graph"
bunx smithers-orchestrator ask --tool-surface raw "List the raw Smithers MCP tools"
bunx smithers-orchestrator ask --list-agents
bunx smithers-orchestrator ask --print-bootstrap "test query"
bunx smithers-orchestrator ask --dump-prompt "test query"
```

### agents

Inspect Smithers' built-in CLI agent capability registries.

```bash
bunx smithers-orchestrator agents <subcommand> [options]
```

| Subcommand | Purpose |
|---|---|
| `agents capabilities` | Print the full capability registry for all built-in CLI agents. |
| `agents doctor` | Validate capability metadata for drift or internal contradictions. |

#### agents capabilities

Print a JSON report of all built-in CLI agent capability registries. Use this to understand which CLI-backed agents Smithers knows about and what features each one advertises.

```bash
bunx smithers-orchestrator agents capabilities
```

Output is always JSON.

#### agents doctor

Validate the built-in CLI agent capability registries for drift or contradictions. Exits with code `0` when the registry is internally consistent, `1` when problems are detected.

```bash
bunx smithers-orchestrator agents doctor [options]
```

| Option | Description |
|---|---|
| `--json <boolean>` | Print the doctor report as JSON instead of human-readable output. Default: `false`. |

```bash
bunx smithers-orchestrator agents doctor
bunx smithers-orchestrator agents doctor --json
```

Use `agents doctor` in CI to catch registry drift before deployment.

### human

List and resolve durable human requests generated by `<HumanTask>` nodes.

```bash
bunx smithers-orchestrator human <action> [requestId] [options]
```

| Subcommand | Purpose |
|---|---|
| `human inbox` | List all pending human requests across runs. |
| `human answer <requestId>` | Submit a JSON response to a pending request. |
| `human cancel <requestId>` | Cancel a pending request without answering it. |

`<HumanTask>` nodes pause workflow execution and record a durable request in the database. This command is the operational surface for that inbox. It is more general than `approve`/`deny`: a human request can carry structured data of any shape, not just a binary gate decision.

#### human inbox

```bash
bunx smithers-orchestrator human inbox
```

Lists all pending human requests. Output includes: `requestId`, `runId`, `workflowName`, `nodeId`, `kind`, `prompt`, `status`, `requestedAt`, `age`, and `timeoutAtMs`.

Pass `--format json` for structured output:

```bash
bunx smithers-orchestrator human inbox --format json
```

#### human answer

Submit a response to a pending human request. The workflow resumes after the answer is recorded.

```bash
bunx smithers-orchestrator human answer <requestId> [options]
```

| Option | Description |
|---|---|
| `--value <string>` | Response as a JSON string. Required. |
| `--by <string>` | Name or identifier of the operator answering. |

```bash
bunx smithers-orchestrator human answer req-abc123 --value '{"approved":true}' --by "alice"
bunx smithers-orchestrator human answer req-abc123 --value '{"choice":"option-a","notes":"Looks good"}' --by "alice"
```

If the `<HumanTask>` node has a corresponding approval gate, `answer` automatically bridges the approval as well — no separate `bunx smithers-orchestrator approve` call is needed.

Exits with code `4` if the request is not found, not pending, or has already expired.

#### human cancel

Cancel a pending human request without providing an answer.

```bash
bunx smithers-orchestrator human cancel <requestId> [options]
```

| Option | Description |
|---|---|
| `--by <string>` | Name or identifier of the operator cancelling. |

```bash
bunx smithers-orchestrator human cancel req-abc123 --by "alice"
```

If a corresponding approval gate exists, it is automatically denied with the cancellation reason. The workflow will resume and handle the cancelled state according to its error handling logic.

### alerts

List, acknowledge, resolve, or silence durable alert instances.

```bash
bunx smithers-orchestrator alerts <action> [alertId] [options]
```

| Action | Purpose |
|---|---|
| `alerts list` | List active alerts from the nearest `smithers.db`. |
| `alerts ack <alertId>` | Mark an alert as acknowledged. |
| `alerts resolve <alertId>` | Mark an alert as resolved. |
| `alerts silence <alertId>` | Silence an alert without resolving it. |

The current help output exposes no alert-specific flags. `ack`, `resolve`, and `silence` require an alert ID.

```bash
bunx smithers-orchestrator alerts list
bunx smithers-orchestrator alerts ack alert-123
bunx smithers-orchestrator alerts resolve alert-123
bunx smithers-orchestrator alerts silence alert-123
```

### memory

Inspect cross-run memory facts and perform semantic recall.

```bash
bunx smithers-orchestrator memory <subcommand> [options]
```

| Subcommand | Purpose |
|---|---|
| `memory list <namespace>` | List all stored facts in a namespace. |
| `memory recall <query>` | Search memory by semantic similarity. |

Both subcommands require `--workflow` to locate the `smithers.db` for the target workflow.

#### memory list

List all facts stored in a namespace.

```bash
bunx smithers-orchestrator memory list <namespace> [options]
```

Arguments:

- `namespace`: Namespace to inspect (e.g. `workflow:my-flow`, `global:default`).

| Option | Description |
|---|---|
| `--workflow`, `-w <string>` | Path to a `.tsx` workflow file. Required to locate the database. |

```bash
bunx smithers-orchestrator memory list workflow:implement -w .smithers/workflows/implement.tsx
bunx smithers-orchestrator memory list global:default -w .smithers/workflows/implement.tsx
```

#### memory recall

Search stored facts by semantic similarity using vector embeddings.

```bash
bunx smithers-orchestrator memory recall <query> [options]
```

Arguments:

- `query`: Natural language search query.

| Option | Description |
|---|---|
| `--workflow`, `-w <string>` | Path to a `.tsx` workflow file. Required to locate the database. |
| `--namespace`, `-n <string>` | Namespace to search within. Default: `global:default`. |
| `--top-k`, `-k <number>` | Number of results to return. Default: `5`. |

```bash
bunx smithers-orchestrator memory recall "auth bug fixes" -w .smithers/workflows/implement.tsx
bunx smithers-orchestrator memory recall "database migrations" -w .smithers/workflows/implement.tsx --namespace workflow:implement --top-k 10
```
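
Semantic recall of this kind is usually a nearest-neighbor search over embedding vectors. The sketch below ranks stored facts by cosine similarity and keeps the top `k`, mirroring the `--top-k` default of `5`. The `Fact` shape, the toy three-dimensional vectors, and `recallTopK` are illustrative assumptions; the source does not state which similarity metric or embedding model Smithers uses.

```ts
interface Fact {
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors; returns 0 when either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every fact against the query embedding and return the k best matches.
function recallTopK(query: number[], facts: Fact[], k = 5): Fact[] {
  return facts
    .map((fact) => ({ fact, score: cosineSimilarity(query, fact.embedding) }))
    .sort((left, right) => right.score - left.score)
    .slice(0, k)
    .map((entry) => entry.fact);
}

const exampleFacts: Fact[] = [
  { text: "fixed auth token refresh", embedding: [1, 0, 0] },
  { text: "renamed migration files", embedding: [0, 1, 0] },
  { text: "patched login redirect", embedding: [0.9, 0.1, 0] },
];

console.log(recallTopK([1, 0, 0], exampleFacts, 2).map((f) => f.text));
```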

See [Cross-Run Memory](/concepts/memory) and [Memory Quickstart](/guides/memory-quickstart) for the runtime model.

### openapi

Preview the AI SDK tools that Smithers would generate from an OpenAPI spec.

```bash
bunx smithers-orchestrator openapi <subcommand>
```

| Subcommand | Purpose |
|---|---|
| `openapi list <specPath>` | List all operations parsed from a spec as named tools. |

#### openapi list

Parse an OpenAPI spec and print the tool names and summaries that Smithers would generate from it.

```bash
bunx smithers-orchestrator openapi list <specPath>
```

Arguments:

- `specPath`: File path or URL to an OpenAPI spec (JSON or YAML).

```bash
bunx smithers-orchestrator openapi list ./api/openapi.yaml
bunx smithers-orchestrator openapi list https://petstore3.swagger.io/api/v3/openapi.json
```

Output lists one tool per operation: `operationId — summary (or method + path)`. The count of tools is printed at the end.

Use this to audit a spec before wiring it into a workflow with `openApiTools()`.

See [OpenAPI Tools](/concepts/openapi-tools) and [OpenAPI Tools Quickstart](/guides/openapi-tools-quickstart) for the authoring model.

### rag

Ingest documents into the RAG store or query it for relevant chunks.

```bash
bunx smithers-orchestrator rag <subcommand> [options]
```

| Subcommand | Purpose |
|---|---|
| `rag ingest <file>` | Chunk, embed, and store a document in the vector store. |
| `rag query <query>` | Retrieve the most relevant chunks for a search query. |

Both subcommands require `--workflow` to locate the `smithers.db` for the target workflow.

#### rag ingest

Chunk and embed a document into the SQLite vector store.

```bash
bunx smithers-orchestrator rag ingest <file> [options]
```

Arguments:

- `file`: Path to the file to ingest.

| Option | Description |
|---|---|
| `--workflow`, `-w <string>` | Path to a `.tsx` workflow file. Required to locate the database. |
| `--namespace`, `-n <string>` | Vector namespace. Default: `default`. |
| `--strategy <string>` | Chunking strategy: `recursive`, `character`, `sentence`, `markdown`, `token`. Default: `recursive`. |
| `--size <number>` | Chunk size in tokens or characters. Default: `1000`. |
| `--overlap <number>` | Chunk overlap. Default: `200`. |

```bash
bunx smithers-orchestrator rag ingest ./docs/api-reference.md -w .smithers/workflows/implement.tsx
bunx smithers-orchestrator rag ingest ./data/knowledge-base.txt -w .smithers/workflows/implement.tsx --namespace kb --strategy markdown
bunx smithers-orchestrator rag ingest ./src/README.md -w .smithers/workflows/implement.tsx --size 500 --overlap 100
```
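
To make `--size` and `--overlap` concrete, here is a minimal sketch of fixed-size character chunking with overlap. It is closest in spirit to a `character`-style strategy and is not the Smithers implementation; `chunkText` is a hypothetical helper whose defaults simply mirror the table above.

```ts
// Split text into windows of `size` characters, each sharing `overlap` characters with the previous one.
function chunkText(text: string, size = 1000, overlap = 200): string[] {
  if (size <= 0) throw new Error("size must be positive");
  if (overlap < 0 || overlap >= size) throw new Error("overlap must be in [0, size)");
  const step = size - overlap; // how far the window advances each time
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // the last window reached the end
  }
  return chunks;
}

console.log(chunkText("abcdefghij", 4, 2)); // ["abcd", "cdef", "efgh", "ghij"]
```

A larger overlap keeps more shared context between neighboring chunks at the cost of storing and embedding more text.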

#### rag query

Search the vector store for chunks most relevant to a query.

```bash
bunx smithers-orchestrator rag query <query> [options]
```

Arguments:

- `query`: Natural language search query.

| Option | Description |
|---|---|
| `--workflow`, `-w <string>` | Path to a `.tsx` workflow file. Required to locate the database. |
| `--namespace`, `-n <string>` | Vector namespace to search. Default: `default`. |
| `--top-k`, `-k <number>` | Number of results to return. Default: `5`. |

```bash
bunx smithers-orchestrator rag query "how does authentication work" -w .smithers/workflows/implement.tsx
bunx smithers-orchestrator rag query "database connection pooling" -w .smithers/workflows/implement.tsx --namespace kb --top-k 10
```

See [RAG](/concepts/rag) and [RAG Quickstart](/guides/rag-quickstart) for the retrieval model.

### workflow

Manage flat workflows in `.smithers/workflows/*.tsx`.

```bash
bunx smithers-orchestrator workflow <command>
```

| Command | Purpose |
|---|---|
| `workflow run <name>` | Run a discovered workflow by ID. |
| `workflow` | List discovered workflows (no subcommand). |
| `workflow list` | List discovered workflows. |
| `workflow path <name>` | Resolve workflow ID to entry file. |
| `workflow create <name>` | Create a new workflow scaffold. |
| `workflow doctor [name]` | Report discovery, preload, bunfig, and detected agents. |

Run shorthand (the first form is rewritten to the second):

```bash
bunx smithers-orchestrator workflow implement --prompt "Add input validation"
bunx smithers-orchestrator workflow run implement --prompt "Add input validation"
```

Shorthand resolution:
1. Resolve name from `.smithers/workflows/<name>.tsx`
2. Rewrite to `bunx smithers-orchestrator workflow run <name>`
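
The two resolution steps can be sketched as a small argument rewrite. This is an illustration only: `rewriteWorkflowArgs` is hypothetical, `args` stands for the tokens after `workflow`, and the file-existence check is injected so the sketch stays self-contained.

```ts
// Subcommands documented for `workflow`; anything else is a candidate shorthand name.
const WORKFLOW_SUBCOMMANDS = ["run", "list", "path", "create", "doctor"];

function rewriteWorkflowArgs(args: string[], exists: (path: string) => boolean): string[] {
  const [first, ...rest] = args;
  if (first === undefined || WORKFLOW_SUBCOMMANDS.indexOf(first) !== -1) return args;
  // Step 1: resolve the name against .smithers/workflows/<name>.tsx.
  if (!exists(`.smithers/workflows/${first}.tsx`)) return args;
  // Step 2: rewrite to `workflow run <name> ...`.
  return ["run", first, ...rest];
}

const hasImplement = (path: string) => path === ".smithers/workflows/implement.tsx";
console.log(rewriteWorkflowArgs(["implement", "--prompt", "Add input validation"], hasImplement));
// ["run", "implement", "--prompt", "Add input validation"]
```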

```bash
bunx smithers-orchestrator workflow
bunx smithers-orchestrator workflow list
bunx smithers-orchestrator workflow run implement --prompt "Investigate flaky tests"
bunx smithers-orchestrator workflow path implement
bunx smithers-orchestrator workflow doctor implement
bunx smithers-orchestrator workflow create foo
bunx smithers-orchestrator workflow foo --prompt "Investigate flaky tests"
```

- Names: lowercase letters, numbers, hyphens only.
- `workflow create` writes only the workflow file. Run `bunx smithers-orchestrator init` first for `.smithers/agents.ts` and `.smithers/components/`.
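
The naming rule above is easy to check before calling `workflow create`. A minimal sketch; `isValidWorkflowName` and `workflowEntryPath` are hypothetical helpers, and the regular expression is one reading of "lowercase letters, numbers, hyphens only":

```ts
// Lowercase ASCII letters, digits, and hyphens; at least one character.
const WORKFLOW_NAME_PATTERN = /^[a-z0-9-]+$/;

function isValidWorkflowName(name: string): boolean {
  return WORKFLOW_NAME_PATTERN.test(name);
}

// Entry file location for a workflow ID, following the `.smithers/workflows/<name>.tsx` layout.
function workflowEntryPath(name: string): string {
  if (!isValidWorkflowName(name)) throw new Error(`Invalid workflow name: ${name}`);
  return `.smithers/workflows/${name}.tsx`;
}

console.log(isValidWorkflowName("my-new-flow")); // true
console.log(isValidWorkflowName("My_Flow"));     // false
console.log(workflowEntryPath("implement"));     // .smithers/workflows/implement.tsx
```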

#### Workflow metadata comments

Discovery extracts optional metadata from `//` comments in the first six lines of each workflow file:

```ts
// smithers-source: generated
// smithers-display-name: Code Review
```

| Marker | Values | Purpose |
|---|---|---|
| `smithers-source` | `seeded`, `generated`, `user` | Origin classification shown in `workflow list` and `workflow doctor`. |
| `smithers-display-name` | Free text | Human-readable name for the workflow. |

`workflow create` automatically inserts both markers.
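
For tooling that needs the same metadata, the extraction can be approximated as a scan of the first six lines for the two markers. A minimal sketch; `parseWorkflowMetadata` is a hypothetical helper, and the exact matching rules Smithers applies (whitespace handling, duplicate markers) are assumptions here:

```ts
interface WorkflowMetadata {
  source?: string;      // value of the smithers-source marker
  displayName?: string; // value of the smithers-display-name marker
}

function parseWorkflowMetadata(fileText: string): WorkflowMetadata {
  const meta: WorkflowMetadata = {};
  // Only the first six lines are considered, as described above.
  const head = fileText.split(/\r?\n/).slice(0, 6);
  for (const line of head) {
    const match = /^\s*\/\/\s*smithers-(source|display-name):\s*(.+?)\s*$/.exec(line);
    if (!match) continue;
    const key = match[1];
    const value = match[2];
    if (value === undefined) continue;
    if (key === "source") meta.source = value;
    else meta.displayName = value;
  }
  return meta;
}

const exampleFile = [
  "// smithers-source: generated",
  "// smithers-display-name: Code Review",
  'import { createSmithers } from "smithers-orchestrator";',
].join("\n");

console.log(parseWorkflowMetadata(exampleFile));
// { source: "generated", displayName: "Code Review" }
```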

#### workflow list

List all discovered workflows under `.smithers/workflows/`. Output includes each workflow's ID, entry file path, and source type.

```bash
bunx smithers-orchestrator workflow list
# implement  .smithers/workflows/implement.tsx  user
# sweep      .smithers/workflows/sweep.tsx      generated
```

#### workflow run

Run a discovered workflow by ID.

```bash
bunx smithers-orchestrator workflow run <name> [options]
```

Arguments:

- `name`: Workflow ID to run.

`workflow run` accepts all of the same execution options as [`up`](#up), plus:

| Option | Description |
|---|---|
| `--prompt`, `-p <string>` | Shorthand for setting `input.prompt` when `--input` is omitted. |

Workflow-specific behavior:

- Resolves `<name>` to `.smithers/workflows/<name>.tsx`.
- Defaults `--root` to `.` when not provided.
- Preserves all other `up` semantics, including `--resume`, `--detach`, `--serve`, and hot reload.

#### workflow path

Resolve a workflow ID to its entry file path. Prints the absolute path, ID, and source type.

```bash
bunx smithers-orchestrator workflow path <name>
```

Arguments:

- `name`: Workflow ID to resolve.

```bash
bunx smithers-orchestrator workflow path implement
# /home/user/project/.smithers/workflows/implement.tsx
```

#### workflow create

Create a new workflow scaffold file in `.smithers/workflows/`. The name must contain only lowercase letters, numbers, and hyphens. Exits with code `4` if the name is invalid or the file already exists.

```bash
bunx smithers-orchestrator workflow create <name>
```

Arguments:

- `name`: ID for the new workflow.

```bash
bunx smithers-orchestrator workflow create my-new-flow
# Created .smithers/workflows/my-new-flow.tsx
```

Run `bunx smithers-orchestrator init` first to ensure `.smithers/agents.ts` and `.smithers/components/` exist.

#### workflow doctor

Inspect workflow discovery health. Reports the workflow root, discovered workflows, preload file status, bunfig status, and detected CLI agents. When `name` is provided, scopes the report to that single workflow.

```bash
bunx smithers-orchestrator workflow doctor [name]
```

Arguments:

- `name` (optional): Workflow ID to inspect. Omit to inspect all discovered workflows.

```bash
bunx smithers-orchestrator workflow doctor
bunx smithers-orchestrator workflow doctor implement
```

### cron

Manage background schedule triggers stored in the nearest `smithers.db`.

```bash
bunx smithers-orchestrator cron <command>
```

| Command | Purpose |
|---|---|
| `cron start` | Start the scheduler loop. |
| `cron add <pattern> <workflowPath>` | Add a cron entry. |
| `cron list` | List registered entries. |
| `cron rm <cronId>` | Delete an entry. |

```bash
bunx smithers-orchestrator cron add "0 * * * *" .smithers/workflows/implement.tsx
bunx smithers-orchestrator cron list
bunx smithers-orchestrator cron start
bunx smithers-orchestrator cron rm 0d3a8b0f-6f1c-4e2b-9f4d-0a7b7b95c2e1
```

- Polls every 15 seconds.
- Due jobs launch as detached `bunx smithers-orchestrator up ... -d` processes.
- `workflowPath` is replayed through `bunx smithers-orchestrator up`; use an explicit file path.

#### cron start

Start the background scheduler loop in the current terminal. The loop polls every 15 seconds and launches due jobs as detached `bunx smithers-orchestrator up ... -d` processes. Runs until interrupted.

```bash
bunx smithers-orchestrator cron start
```

#### cron add

Register a new workflow cron schedule. Returns the generated `cronId`.

```bash
bunx smithers-orchestrator cron add <pattern> <workflowPath>
```

Arguments:

- `pattern`: Cron expression (e.g. `"0 * * * *"` for hourly, `"*/5 * * * *"` for every 5 minutes).
- `workflowPath`: Path to the workflow file to schedule. Use an explicit file path; it is replayed through `bunx smithers-orchestrator up`.

```bash
bunx smithers-orchestrator cron add "0 9 * * *" .smithers/workflows/sweep.tsx
bunx smithers-orchestrator cron add "*/30 * * * *" .smithers/workflows/implement.tsx
```
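
To see how a pattern such as `"0 * * * *"` or `"*/5 * * * *"` decides when a job is due, here is a deliberately small matcher. It handles only `*`, `*/n`, and single numbers, and it evaluates in UTC; real cron syntax also allows ranges, lists, and names, and the time zone Smithers uses is not stated in the source. `matchesField` and `isDue` are hypothetical names.

```ts
// Match one cron field against a value. Supports "*", "*/n", and a single number only.
function matchesField(field: string, value: number): boolean {
  if (field === "*") return true;
  if (field.indexOf("*/") === 0) {
    const step = Number(field.slice(2));
    return step > 0 && value % step === 0;
  }
  return Number(field) === value;
}

// True when a five-field pattern (minute, hour, day of month, month, day of week) matches `date`.
function isDue(pattern: string, date: Date): boolean {
  const fields = pattern.trim().split(/\s+/);
  if (fields.length !== 5) throw new Error(`Expected 5 cron fields, got ${fields.length}`);
  const values = [
    date.getUTCMinutes(),
    date.getUTCHours(),
    date.getUTCDate(),
    date.getUTCMonth() + 1, // cron months are 1-12
    date.getUTCDay(),       // 0 is Sunday
  ];
  return fields.every((field, i) => matchesField(field, values[i] as number));
}

const noon = new Date(Date.UTC(2026, 3, 9, 12, 0));
console.log(isDue("0 * * * *", noon));   // true: minute 0 of any hour
console.log(isDue("0 9 * * *", noon));   // false: only 09:00
console.log(isDue("*/30 * * * *", noon)); // true: minute 0 is a multiple of 30
```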

#### cron list

List all registered background cron schedules from `smithers.db`. Output includes `cronId`, pattern, workflow path, enabled state, and last/next run timestamps.

```bash
bunx smithers-orchestrator cron list
```

#### cron rm

Delete an existing cron schedule by its ID.

```bash
bunx smithers-orchestrator cron rm <cronId>
```

Arguments:

- `cronId`: UUID of the cron entry to remove (from `cron list` output).

```bash
bunx smithers-orchestrator cron rm 0d3a8b0f-6f1c-4e2b-9f4d-0a7b7b95c2e1
```

## Framework-Provided Built-Ins

### completions

Print a completion script for the given shell.

```bash
bunx smithers-orchestrator completions bash
bunx smithers-orchestrator completions zsh
bunx smithers-orchestrator completions fish
bunx smithers-orchestrator completions nushell
```

| Shell | Setup |
|---|---|
| `bash` | `eval "$(bunx smithers-orchestrator completions bash)"` |
| `zsh` | `eval "$(bunx smithers-orchestrator completions zsh)"` |
| `fish` | `bunx smithers-orchestrator completions fish \| source` |
| `nushell` | Add output of `bunx smithers-orchestrator completions nushell` to `config.nu` |

### mcp add

Register Smithers as an MCP server for an agent integration.

```bash
bunx smithers-orchestrator mcp add [options]
```

| Option | Description |
|---|---|
| `--command`, `-c <cmd>` | Override the agent command. |
| `--no-global` | Project-local install. |
| `--agent <agent>` | Target agent (`claude-code`, `cursor`, etc.). |

```bash
bunx smithers-orchestrator mcp add
bunx smithers-orchestrator mcp add --agent claude-code
bunx smithers-orchestrator mcp add --command "bunx smithers-orchestrator --mcp"
```

### skills

Sync skill files to agent integrations.

```bash
bunx smithers-orchestrator skills <command>
```

| Command | Description |
|---|---|
| `skills add [options]` | Sync skill files to agents. |
| `skills list` | List available skills. |

`skills` also has the alias `skill`.

#### skills add

```bash
bunx smithers-orchestrator skills add [options]
```

| Option | Description |
|---|---|
| `--depth <number>` | Grouping depth. Default: `1`. |
| `--no-global` | Project-local install. |

```bash
bunx smithers-orchestrator skills add
bunx smithers-orchestrator skills add --depth 2
bunx smithers-orchestrator skills add --no-global
```

#### skills list

```bash
bunx smithers-orchestrator skills list
```

## Global Options

| Option | Description |
|---|---|
| `--format <toon\|json\|yaml\|md\|jsonl>` | Output format. |
| `--verbose` | Full output envelope. |
| `--filter-output <keys>` | Filter JSON by key path (e.g. `runs[0].id`). |
| `--schema` | Print JSON schema for the command. |
| `--llms`, `--llms-full` | Print LLM-readable manifest. |
| `--token-count` | Print token count instead of output. |
| `--token-limit <n>` | Limit output to `n` tokens. |
| `--token-offset <n>` | Skip first `n` output tokens. |
| `--help` | Command help. |
| `--version` | CLI version. |
| `--mcp` | Start as MCP stdio server. Used by `bunx smithers-orchestrator ask` and external tooling. |
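
A key path such as `runs[0].id` selects a nested value from JSON output. The sketch below shows one way such a path can be resolved; `getByKeyPath` is a hypothetical helper, and the full key-path grammar that `--filter-output` accepts is not specified in the source.

```ts
// Resolve a dotted path with optional [index] segments against a parsed JSON value.
function getByKeyPath(value: unknown, keyPath: string): unknown {
  // "runs[0].id" becomes ["runs", "0", "id"].
  const segments = keyPath
    .replace(/\[(\d+)\]/g, ".$1")
    .split(".")
    .filter((segment) => segment !== "");
  let current: unknown = value;
  for (const segment of segments) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

const exampleOutput = { runs: [{ id: "run-123", status: "finished" }] };
console.log(getByKeyPath(exampleOutput, "runs[0].id")); // "run-123"
```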

## Operational Behavior

### Progress reporting

During `up` execution, the CLI writes progress events to stderr with elapsed timestamps:

```
[00:00:02] → analyze (attempt 1, iteration 0)
[00:00:45] ✓ analyze (attempt 1)
[00:00:45] → implement (attempt 1, iteration 0)
[00:01:30] ✗ implement (attempt 1): tool timeout exceeded
[00:01:30] ↻ implement retrying (attempt 2)
[00:02:15] ✓ implement (attempt 2)
[00:02:15] ⏱ deploy waiting for timer (fires 2026-04-09T12:00:00Z)
```

Progress includes node lifecycle (`→ ✓ ✗ ↻`), timer events (`⏱`), hot-reload status, and run completion.

### Signal handling

The CLI listens for `SIGINT` and `SIGTERM`. On receipt it:

1. Writes `[smithers] received <signal>, cancelling run...` to stderr.
2. Fires the abort controller to cancel in-flight operations.
3. Closes the SQLite database connection.
4. Exits with code `130` (SIGINT) or `143` (SIGTERM).

Signals are handled at most once; a second signal during shutdown force-kills the process.
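
The shutdown sequence can be sketched as a handler that runs its steps once and then exits with the signal's code. This is an illustration under stated assumptions, not the CLI's source: `createShutdownHandler` and the injected `ShutdownDeps` callbacks are hypothetical stand-ins for the abort controller, the database client, and the process exit call.

```ts
type ShutdownSignal = "SIGINT" | "SIGTERM";

interface ShutdownDeps {
  log: (line: string) => void;  // writes to stderr in a real CLI
  abort: () => void;            // fires the abort controller
  closeDb: () => void;          // closes the SQLite connection
  exit: (code: number) => void; // terminates the process
}

// Exit codes from the steps above: 130 for SIGINT, 143 for SIGTERM.
const SIGNAL_EXIT_CODES: Record<ShutdownSignal, number> = { SIGINT: 130, SIGTERM: 143 };

function createShutdownHandler(deps: ShutdownDeps): (signal: ShutdownSignal) => void {
  let handled = false;
  return (signal) => {
    if (handled) return; // the graceful path runs at most once
    handled = true;
    deps.log(`[smithers] received ${signal}, cancelling run...`);
    deps.abort();
    deps.closeDb();
    deps.exit(SIGNAL_EXIT_CODES[signal]);
  };
}

const shutdownCalls: string[] = [];
const onSignal = createShutdownHandler({
  log: (line) => { shutdownCalls.push(`log:${line}`); },
  abort: () => { shutdownCalls.push("abort"); },
  closeDb: () => { shutdownCalls.push("closeDb"); },
  exit: (code) => { shutdownCalls.push(`exit:${code}`); },
});
onSignal("SIGINT");
console.log(shutdownCalls);
```

In a Node or Bun process, a handler like this would typically be registered with `process.once("SIGINT", ...)` and `process.once("SIGTERM", ...)`, so that a second signal falls through to the runtime's default termination.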

### SQLite cleanup

On process exit, SIGINT, or SIGTERM the CLI closes the SQLite connection via `client.close()`. This ensures WAL checkpointing and prevents database corruption from unclean shutdowns.

### Event formatting

`logs` and `events` render each event line as:

```
[+MM:SS.mmm] <symbol> <summary>
```

Symbols are color-coded: `✓` green for success, `✗` red for failure, `⏸` for approvals, `⏱` for timers, `🔧` for tool calls, `📚` for RAG, `🧠` for memory. Payloads are truncated to 240 characters by default.
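
The line layout and the 240-character payload limit can be sketched as follows. The helper names are hypothetical, the offset is assumed to be milliseconds since the start of the run, and whether the CLI appends a truncation marker is not stated, so this sketch simply cuts the string.

```ts
// Left-pad a non-negative integer with zeros to the given width.
function zeroPad(n: number, width: number): string {
  let text = String(n);
  while (text.length < width) text = "0" + text;
  return text;
}

// Render a millisecond offset as [+MM:SS.mmm].
function formatOffset(offsetMs: number): string {
  const minutes = Math.floor(offsetMs / 60000);
  const seconds = Math.floor((offsetMs % 60000) / 1000);
  const millis = Math.floor(offsetMs % 1000);
  return `[+${zeroPad(minutes, 2)}:${zeroPad(seconds, 2)}.${zeroPad(millis, 3)}]`;
}

// Cut a payload to at most `max` characters (240 by default).
function truncatePayload(payload: string, max = 240): string {
  return payload.length <= max ? payload : payload.slice(0, max);
}

function formatEventLine(offsetMs: number, symbol: string, summary: string): string {
  return `${formatOffset(offsetMs)} ${symbol} ${summary}`;
}

console.log(formatEventLine(65123, "✓", "analyze finished")); // [+01:05.123] ✓ analyze finished
```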

## Exit Codes

| Code | Meaning |
|---|---|
| `0` | Success. |
| `1` | Execution failure. |
| `2` | Run cancelled / `cancel` succeeded. |
| `3` | `up` completed in `waiting-approval`. |
| `4` | Invalid arguments or user-correctable input error. |
| `130` | Interrupted by SIGINT. |
| `143` | Terminated by SIGTERM. |
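
Scripts that wrap the CLI can branch on these codes. A minimal sketch that maps each documented code to a label; `classifyExitCode` and the `RunOutcome` labels are illustrative names, and any code not in the table is reported as `unknown`:

```ts
type RunOutcome =
  | "success"
  | "failed"
  | "cancelled"
  | "waiting-approval"
  | "invalid-input"
  | "interrupted"
  | "terminated"
  | "unknown";

// Translate a CLI exit code into a label using the table above.
function classifyExitCode(code: number): RunOutcome {
  switch (code) {
    case 0: return "success";
    case 1: return "failed";
    case 2: return "cancelled";
    case 3: return "waiting-approval";
    case 4: return "invalid-input";
    case 130: return "interrupted";
    case 143: return "terminated";
    default: return "unknown";
  }
}

console.log(classifyExitCode(3));  // "waiting-approval"
console.log(classifyExitCode(99)); // "unknown"
```

Treating `3` separately from `1` lets a wrapper notify an approver instead of reporting a failure.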

## Related

- [runWorkflow](/runtime/run-workflow) -- programmatic runtime API
- [Events](/runtime/events) -- persisted lifecycle events
- [Revert](/runtime/revert) -- filesystem snapshot restoration
- [Hot Reload](/guides/hot-reload) -- `up --hot true`
- [Approvals](/concepts/approvals) -- `approve` and `deny`
- [Time Travel](/concepts/time-travel) -- snapshots, forking, and replay

---

## Workflows Overview

> What Smithers workflows are, when to use them, and the core principles behind the system.
> Source: https://smithers.sh/concepts/workflows-overview

You have a problem. You asked an AI agent to review a codebase, apply fixes, run the tests, and write a summary. It started well, then hallucinated a file path, lost track of which fixes it already applied, and — when your laptop went to sleep — forgot everything.

You could wrap the agent call in a retry loop and pray. Or you could break the problem into pieces, give each piece a name, a schema, and an execution order, and let the machine handle the rest.

That's what a Smithers workflow is: a typed, resumable execution plan for multi-step AI work.

## When to Use Workflows

Ask yourself: does this job need more than one step, and does the order matter?

If you have a single prompt that needs one LLM call, a workflow is overkill. But the moment you need coordination — analyze, then fix, then validate, then report — you need answers to questions that ad hoc scripts dodge:

- Which [agent](/concepts/agents-and-tools) handles each step?
- How does [data](/concepts/workflow-state) flow between steps?
- Which steps can run in [parallel](/components/parallel)?
- Where does a human need to [approve](/concepts/approvals) something?
- What happens when a step [fails](/guides/error-handling) at 2 AM?

Workflows give you structure for all of this. And because every completed step is persisted to [SQLite](https://sqlite.org), you get durability for free.

## Core Principles

Three ideas, in order:

1. **Define [tasks](/components/task)** as [JSX components](/jsx/overview) with typed input/output schemas
2. **Compose tasks** using control-flow primitives ([`<Sequence>`](/components/sequence), [`<Parallel>`](/components/parallel), [`<Branch>`](/components/branch), [`<Loop>`](/components/loop))
3. **Run workflows** with built-in persistence, [resumability](/concepts/suspend-and-resume), [approval gates](/concepts/approvals), and streaming

That's the whole framework. Everything else follows from these three.

## Building Blocks

### Tasks

A [`<Task>`](/components/task) is the smallest unit of work. It has an `id`, an `output` schema, and one of three modes. The simplest way to see the difference is to look at all three:

```tsx
// Agent mode — send a prompt to an AI agent
<Task id="analyze" output={outputs.analysis} agent={claude}>
  {`Analyze the codebase in ${ctx.input.repo}`}
</Task>

// Compute mode — run a function at execution time
<Task id="validate" output={outputs.validation}>
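  {async () => {
    // `$` is assumed to be Bun's shell (`import { $ } from "bun"`).
    // .nothrow() keeps a non-zero exit from throwing, so exitCode can be checked.
    const result = await $`bun test`.nothrow().quiet();
    return { passed: result.exitCode === 0 };
  }}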
</Task>

// Static mode — write a value directly
<Task id="config" output={outputs.config}>
  {{ environment: "production", debug: false }}
</Task>
```

Agent mode sends a prompt to an LLM. Compute mode runs arbitrary code. Static mode writes a literal value. Every other feature — retries, validation, deps — layers on top of these three modes.

### Control Flow

"But why not just write `await step1(); await step2();`?"

You could. But then you lose resumability, parallelism, and conditional branching — and you're back to the ad hoc script. These four primitives give you the same expressiveness with none of the bookkeeping:

| Component | Purpose | Behavior |
| --- | --- | --- |
| [`<Sequence>`](/components/sequence) | Run tasks one after another | Each child waits for the previous to complete |
| [`<Parallel>`](/components/parallel) | Run tasks concurrently | All children start together (respecting concurrency limits) |
| [`<Branch>`](/components/branch) | Choose one path | Evaluates a condition and runs `then` or `else` |
| [`<Loop>`](/components/loop) | Repeat until a condition | Re-executes children each iteration until `until` is true |

Four components. That's the entire control-flow vocabulary. See [Control Flow](/concepts/control-flow) for detailed guidance on each primitive.

### Schemas

You might be wondering: how does Smithers know if an agent returned useful output or nonsense?

Every task declares what it produces using a [Zod](https://zod.dev) schema. Smithers validates the agent's output against that schema automatically. If validation fails, the agent is retried with the error as feedback — no manual wrangling required.

```tsx
import { createSmithers } from "smithers-orchestrator";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  analysis: z.object({
    summary: z.string(),
    risk: z.enum(["low", "medium", "high"]),
  }),
  fix: z.object({
    filesChanged: z.array(z.string()),
    description: z.string(),
  }),
});
```

The `outputs` object is type-checked at compile time. Write `outputs.analysis` incorrectly and the compiler catches it — not your production logs at midnight.
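
To make the validate-and-retry behavior concrete, here is a minimal sketch in plain TypeScript. It uses neither Smithers nor Zod: `validateAnalysis`, `runWithValidation`, and `flakyAgent` are hypothetical names invented for illustration, and the real retry policy may differ.

```ts
// Toy model of "validate the output, retry with the error as feedback".
// None of these names are Smithers APIs; they are illustrative only.

type Analysis = { summary: string; risk: "low" | "medium" | "high" };

type Checked =
  | { kind: "valid"; value: Analysis }
  | { kind: "invalid"; error: string };

type RunOutcome =
  | { status: "finished"; value: Analysis; attempts: number; feedbackLog: string[] }
  | { status: "failed"; attempts: number; feedbackLog: string[] };

// Hand-written stand-in for a schema check (Smithers uses Zod for this).
function validateAnalysis(raw: unknown): Checked {
  if (typeof raw !== "object" || raw === null) {
    return { kind: "invalid", error: "expected an object" };
  }
  const { summary, risk } = raw as { summary?: unknown; risk?: unknown };
  if (typeof summary !== "string") {
    return { kind: "invalid", error: "summary must be a string" };
  }
  if (risk === "low" || risk === "medium" || risk === "high") {
    return { kind: "valid", value: { summary, risk } };
  }
  return { kind: "invalid", error: "risk must be low, medium, or high" };
}

// A fake agent: receives the prompt plus any feedback from the last attempt.
type FakeAgent = (prompt: string, feedback: string | undefined) => unknown;

function runWithValidation(agent: FakeAgent, prompt: string, maxAttempts: number): RunOutcome {
  const feedbackLog: string[] = [];
  let feedback: string | undefined = undefined;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const checked = validateAnalysis(agent(prompt, feedback));
    if (checked.kind === "valid") {
      return { status: "finished", value: checked.value, attempts: attempt, feedbackLog };
    }
    feedback = checked.error; // the next attempt sees why the last one failed
    feedbackLog.push(checked.error);
  }
  return { status: "failed", attempts: maxAttempts, feedbackLog };
}

// This agent returns an invalid risk value until it is told what went wrong.
const flakyAgent: FakeAgent = (_prompt, feedback) =>
  feedback === undefined
    ? { summary: "Two unsafe casts", risk: "severe" }
    : { summary: "Two unsafe casts", risk: "high" };

const validated = runWithValidation(flakyAgent, "Analyze the repo", 3);
```

In this toy run the first attempt fails validation, the error string is handed back to the agent, and the second attempt passes. Only the validated value would ever reach durable storage.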

## A Complete Workflow

Here is a workflow that analyzes code, optionally fixes issues, and writes a report:

```tsx
/** @jsxImportSource smithers-orchestrator */
import { createSmithers, Sequence, Branch, Task } from "smithers-orchestrator";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  analysis: z.object({
    summary: z.string(),
    hasIssues: z.boolean(),
    issues: z.array(z.string()),
  }),
  fix: z.object({ filesChanged: z.array(z.string()) }),
  report: z.object({ title: z.string(), body: z.string() }),
});

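// `reviewer` and `coder` are agent instances assumed to be defined elsewhere
// (see Agents and Tools); they are not created in this snippet.
export default smithers((ctx) => {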
  const analysis = ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" });

  return (
    <Workflow name="code-review">
      <Sequence>
        <Task id="analyze" output={outputs.analysis} agent={reviewer}>
          {`Analyze: ${ctx.input.repo}`}
        </Task>

        {analysis ? (
          <>
            <Branch
              if={analysis.hasIssues}
              then={
                <Task id="fix" output={outputs.fix} agent={coder} deps={{ analyze: outputs.analysis }}>
                  {(deps) => `Fix these issues: ${deps.analyze.issues.join(", ")}`}
                </Task>
              }
            />

            <Task id="report" output={outputs.report} deps={{ analyze: outputs.analysis }}>
              {(deps) => ({
                title: `Review of ${ctx.input.repo}`,
                body: deps.analyze.summary,
              })}
            </Task>
          </>
        ) : null}
      </Sequence>
    </Workflow>
  );
});
```

Read the code before the explanation — most of it should be clear from the [JSX](/jsx/overview) alone.

A few things worth calling out:

- **Typed schemas** define what each task produces. No ambiguity about shape.
- **Sequential execution** via [`<Sequence>`](/components/sequence) ensures `analyze` finishes before anything downstream runs.
- **[Typed handoff](/concepts/workflow-state)** via `deps={{ ... }}` gives downstream tasks direct access to upstream output — no prompt-plumbing boilerplate.
- **Render-time branching** via [`ctx.outputMaybe(...)`](/concepts/workflow-state) handles the case where the analysis hasn't run yet (first render) versus when it has (subsequent renders).
- **Conditional logic** with [`<Branch>`](/components/branch) skips the fix step entirely when there are no issues.

The render-time branching point is the key insight: the JSX tree re-renders as tasks complete, and each render can produce a different tree based on what's known so far.

## How Execution Works

The workflow runs in a loop — think of it like [React's render cycle](https://react.dev/learn/preserving-and-resetting-state), but for task orchestration:

1. **Render** — Smithers renders the JSX tree with the current context
2. **Extract** — It finds executable tasks from the rendered tree
3. **Execute** — Ready tasks run (agent calls, functions, or static writes)
4. **Persist** — Outputs are validated and written to SQLite
5. **Repeat** — The tree re-renders with updated context until all tasks complete

Each render can produce a different set of ready tasks because branching and `outputMaybe` respond to what's already been computed. This is the high-level cycle. For the full internal model, see [Execution Model](/concepts/execution-model).

## Durability

Here is where workflows earn their keep over a chain of `await` calls.

Every completed task writes its output to SQLite immediately. If the process crashes:

- Completed tasks are never re-run
- The workflow resumes from the last incomplete task
- Approval gates survive restarts
- Loop iteration state is preserved

Your laptop can go to sleep. Your server can reboot. An hour-long agent workflow picks up where it left off. See [Suspend and Resume](/concepts/suspend-and-resume) for the full durability model.

## Running Workflows

### From the CLI

```bash
# Start a new run
smithers up workflow.tsx --input '{"repo": "/my-project"}'

# Resume after a crash
smithers up workflow.tsx --run-id abc123 --resume true

# Check status
smithers inspect abc123
```

### Programmatically

```ts
import { runWorkflow } from "smithers-orchestrator";
import workflow from "./workflow";

const result = await runWorkflow(workflow, {
  input: { repo: "/my-project" },
});

if (result.status === "finished") {
  console.log(result.output);
}
```

## Result Statuses

A workflow run resolves to one of these statuses:

| Status | Meaning |
| --- | --- |
| `finished` | All tasks completed successfully |
| `failed` | A task failed after exhausting retries |
| `waiting-approval` | Paused at an approval gate |
| `cancelled` | Stopped by the user or runtime |

## Next Steps

- [JSX Quickstart](/jsx/quickstart) — Build your first workflow hands-on.
- [Control Flow](/concepts/control-flow) — Learn the four control-flow primitives and when to use each.
- [Workflow State](/concepts/workflow-state) — Understand how data flows between tasks.
- [Agents and Tools](/concepts/agents-and-tools) — Add agent reasoning and sandboxed tools to tasks.
- [Execution Model](/concepts/execution-model) — See how render, execute, persist, and resume fit together.

---

## Execution Model

> How Smithers turns a JSX workflow into a durable render-execute-persist loop.
> Source: https://smithers.sh/concepts/execution-model

Here's what actually happens when you hit run.

You hand Smithers a [JSX](/jsx/overview) tree. It doesn't execute anything yet. It _renders_ the tree into a [frame](/runtime/render-frame) -- a snapshot of what work exists right now. Then it finds the tasks that are ready, runs them, writes the results to [SQLite](https://sqlite.org), and renders again. That loop continues until there's nothing left to do.

That's the whole model. Everything else is detail.

But the details matter, because three properties fall out of this design that you'd otherwise have to build yourself:

- Workflow definitions stay declarative. You write what the work _is_, not how to schedule it.
- Execution state is durable. Crash, restart, resume -- the database is the truth.
- Branching and loops are data-driven. No hidden scheduler magic, no opaque state machines.

Let's walk through one complete turn of the loop.

## The Core Loop

Every run follows the same five-phase cycle:

```txt
1. Render   build a frame from JSX + current context
2. Extract  turn mounted nodes into task descriptors
3. Schedule find ready tasks whose dependencies are satisfied
4. Execute  run agent, compute, or static tasks
5. Persist  write outputs, attempts, events, and frame state
           -> render again until terminal
```

[`runWorkflow(...)`](/runtime/run-workflow) drives that loop until the run finishes, fails, is cancelled, or pauses for [approval](/concepts/approvals).

If you've ever watched a [React app re-render after a state change](https://react.dev/learn/preserving-and-resetting-state), you already have the intuition. The difference is that Smithers' "state" lives in a database, and its "side effects" are agent calls and compute functions instead of DOM updates.
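
The cycle can be modeled in a few lines of plain TypeScript. This is a toy sketch, not Smithers' implementation: `runToCompletion`, `RenderFn`, and `MountedTask` are hypothetical names, a plain object stands in for SQLite, and scheduling is reduced to "the first mounted task without an output".

```ts
// Toy render -> schedule -> execute -> persist loop. Not Smithers code:
// every type and function here is hypothetical and heavily simplified.

type PersistedOutputs = Record<string, unknown>; // stands in for SQLite rows

type MountedTask = {
  id: string;
  run: () => unknown; // stands in for agent / compute / static execution
};

// "Render": a pure function from persisted state to the tasks mounted now.
type RenderFn = (persisted: PersistedOutputs) => MountedTask[];

function runToCompletion(render: RenderFn, persisted: PersistedOutputs) {
  const executed: string[] = [];
  let renders = 0;
  for (;;) {
    const mounted = render(persisted); // 1. render
    renders++;
    // 2-3. extract + schedule: here, simply the first task without an output
    let next: MountedTask | undefined = undefined;
    for (const task of mounted) {
      if (!(task.id in persisted)) {
        next = task;
        break;
      }
    }
    if (next === undefined) return { executed, renders }; // terminal
    const output = next.run();   // 4. execute
    persisted[next.id] = output; // 5. persist
    executed.push(next.id);
  }
}

// "review" is only mounted once "analyze" has a persisted output.
const reviewLoop: RenderFn = (persisted) => {
  const mounted: MountedTask[] = [
    { id: "analyze", run: () => ({ summary: "Analysis of caching" }) },
  ];
  if ("analyze" in persisted) {
    mounted.push({ id: "review", run: () => ({ approved: true, feedback: "LGTM" }) });
  }
  return mounted;
};

const loopResult = runToCompletion(reviewLoop, {});
```

Here `loopResult.executed` is `["analyze", "review"]` after three renders: the third render finds nothing left to run, which is the terminal condition.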

## Phase 1: Render a Frame

Think of the workflow builder as a factory floor blueprint. Every time you render, you're asking: _given what we know right now, what does the floor plan look like?_

`createSmithers(...)` returns a `smithers` wrapper; the builder function you pass to it is a pure function of the current context:

```tsx
const workflow = smithers((ctx) => (
  <Workflow name="review-loop">
    <Task id="analyze" output={outputs.analysis}>
      {{ summary: `Analysis of ${ctx.input.topic}` }}
    </Task>

    {ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" }) ? (
      <Task id="review" output={outputs.review}>
        {{ approved: true, feedback: "LGTM" }}
      </Task>
    ) : null}
  </Workflow>
));
```

Notice what's happening. On the first render, `analyze` hasn't run yet, so `outputMaybe` returns `undefined`, and the `review` task doesn't appear in the frame at all. After `analyze` finishes and its output is persisted, the next render sees that output and mounts `review`.

At render time Smithers does not execute tasks. It only builds a frame that reflects:

- the current `ctx.input`
- previously persisted outputs
- current loop iteration state
- mounted control-flow nodes such as [`<Branch>`](/components/branch), [`<Loop>`](/components/loop), and [`<Approval>`](/components/approval)

This is why normal JSX conditions work: each render sees the latest persisted state. No special conditional API, no "if-node." Just JavaScript.

## Phase 2: Extract Task Descriptors

You have a rendered tree. Now what?

Smithers walks that tree and extracts `TaskDescriptor` records -- flat, serializable descriptions of each piece of work. Think of it as reading the blueprint and writing work orders.

Each descriptor includes things like:

- `nodeId`
- output destination
- task kind: agent, compute, or static
- retry and timeout policy
- dependency metadata
- loop iteration
- concurrency-group metadata from [`<Parallel>`](/components/parallel) or [`<MergeQueue>`](/components/merge-queue)
- worktree metadata from [`<Worktree>`](/components/worktree)

Control-flow components don't execute anything themselves. They shape what gets extracted:

- `<Sequence>` contributes ordering
- `<Parallel>` contributes a concurrency group
- `<Branch>` mounts exactly one branch for the current frame
- `<Loop>` remounts its children for the active iteration
- `skipIf` removes nodes entirely from the current frame
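
As a rough illustration of how control-flow nodes shape extraction, the sketch below walks a small tree and emits flat descriptors. The node shapes (`FlowNode`), the `ToyDescriptor` fields, and `extractDescriptors` are assumptions made for this example; the real `TaskDescriptor` carries more metadata.

```ts
// Toy extraction pass: walk a rendered tree and emit flat task descriptors.
// The node shapes and field names are hypothetical, not Smithers' real types.

type FlowNode =
  | { kind: "task"; id: string }
  | { kind: "sequence"; children: FlowNode[] }
  | { kind: "parallel"; groupId: string; children: FlowNode[] }
  // `otherwise` mirrors the `else` prop of <Branch>
  | { kind: "branch"; condition: boolean; then: FlowNode; otherwise?: FlowNode };

type ToyDescriptor = {
  nodeId: string;
  ordinal: number;                 // position in extraction order
  concurrencyGroup: string | null; // innermost enclosing parallel group, if any
};

function extractDescriptors(root: FlowNode): ToyDescriptor[] {
  const descriptors: ToyDescriptor[] = [];
  function walk(node: FlowNode, group: string | null): void {
    switch (node.kind) {
      case "task":
        descriptors.push({ nodeId: node.id, ordinal: descriptors.length, concurrencyGroup: group });
        return;
      case "sequence":
        for (const child of node.children) walk(child, group);
        return;
      case "parallel":
        for (const child of node.children) walk(child, node.groupId);
        return;
      case "branch": {
        // Only the selected side is mounted; the other side is never visited.
        const selected = node.condition ? node.then : node.otherwise;
        if (selected !== undefined) walk(selected, group);
        return;
      }
    }
  }
  walk(root, null);
  return descriptors;
}

const tree: FlowNode = {
  kind: "sequence",
  children: [
    { kind: "task", id: "analyze" },
    {
      kind: "branch",
      condition: false,
      then: { kind: "task", id: "fix" },
      otherwise: { kind: "task", id: "notify" },
    },
    {
      kind: "parallel",
      groupId: "checks",
      children: [
        { kind: "task", id: "lint" },
        { kind: "task", id: "test" },
      ],
    },
  ],
};

const extracted = extractDescriptors(tree);
```

The `fix` task never shows up in `extracted` because its branch was not selected for this frame.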

"Wait," you might be thinking, "if `<Branch>` only mounts one branch, what happened to the other branches?" They simply aren't in the frame. They don't exist as far as the scheduler is concerned. If the data changes and a different branch should be active, the next render will mount it.

## Phase 3: Schedule Ready Tasks

Now the scheduler looks at the descriptors and asks a simple question for each one: _can this run right now?_

Typical task states are:

| State | Meaning |
| --- | --- |
| `pending` | Known to the workflow but not yet runnable |
| `in-progress` | Currently executing |
| `finished` | Persisted successfully |
| `failed` | Terminal failure after retries are exhausted |
| [`waiting-approval`](/concepts/approvals) | Blocked on a durable approval decision |
| [`waiting-event`](/components/wait-for-event) | Blocked on an external signal or event |
| [`waiting-timer`](/components/timer) | Suspended until a durable timer fires |
| `cancelled` | Interrupted by an abort signal or explicit handoff |
| `skipped` | Not mounted or intentionally bypassed |

A task becomes runnable only when:

- all sequential/dependency constraints are satisfied
- any required approval has been granted
- the current branch or loop iteration has mounted it
- its concurrency group allows another slot

There's no priority queue, no weight heuristic, no topological sort happening behind your back. A task is either ready or it isn't, and you can tell which by looking at the rendered frame and the persisted state. That's it.
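
The four conditions above translate almost directly into a predicate. The following is a sketch under assumed data shapes: `Candidate`, `SchedulerView`, and `isRunnable` are hypothetical names, and the real scheduler tracks more state than this.

```ts
// Toy readiness check mirroring the four conditions listed above.
// Field names are hypothetical; the real scheduler tracks more state.

type Candidate = {
  nodeId: string;
  mounted: boolean;                // present in the current frame
  dependsOn: string[];             // nodeIds that must be finished first
  needsApproval: boolean;
  concurrencyGroup: string | null;
};

type SchedulerView = {
  finished: string[];              // nodeIds with persisted outputs
  approved: string[];              // nodeIds whose approval was granted
  running: Record<string, number>; // in-flight count per concurrency group
  limits: Record<string, number>;  // max concurrency per group
};

function isRunnable(task: Candidate, view: SchedulerView): boolean {
  if (!task.mounted) return false;
  for (const dep of task.dependsOn) {
    if (view.finished.indexOf(dep) === -1) return false;
  }
  if (task.needsApproval && view.approved.indexOf(task.nodeId) === -1) return false;
  if (task.concurrencyGroup !== null) {
    const inFlight = view.running[task.concurrencyGroup] ?? 0;
    const limit = view.limits[task.concurrencyGroup] ?? Infinity;
    if (inFlight >= limit) return false;
  }
  return true;
}

const schedulerView: SchedulerView = {
  finished: ["analyze"],
  approved: [],
  running: { checks: 2 },
  limits: { checks: 2 },
};

const fixTask: Candidate = {
  nodeId: "fix", mounted: true, dependsOn: ["analyze"], needsApproval: false, concurrencyGroup: null,
};
const deployTask: Candidate = {
  nodeId: "deploy", mounted: true, dependsOn: ["fix"], needsApproval: true, concurrencyGroup: null,
};
const lintTask: Candidate = {
  nodeId: "lint", mounted: true, dependsOn: [], needsApproval: false, concurrencyGroup: "checks",
};

const readiness = [fixTask, deployTask, lintTask].map((t) => isRunnable(t, schedulerView));
```

In this example only `fix` is runnable: `deploy` is waiting on `fix`, and `lint` is waiting for a free slot in its group.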

## Phase 4: Execute Tasks

Smithers supports three task execution modes. The simplest first:

### Static tasks

When the children are plain data and no `agent` is present, Smithers writes that payload directly. No computation, no network call. This is useful for seeding known values into the workflow.

### Compute tasks

When the children are a function and no `agent` is present, Smithers runs that callback at execution time and persists the returned value. Your function, your logic, deterministic output.

### Agent tasks

When `agent` is present, Smithers renders the children to markdown, sends that prompt to the agent, validates the returned JSON, and persists the validated row. This is where LLMs enter the picture.

Across all three modes, the runtime applies the same operational policies:

- timeout handling
- retries
- `continueOnFail`
- [caching](/concepts/caching)
- [approval waits](/concepts/approvals)
- [event emission](/runtime/events)

The uniformity is deliberate. Whether a task calls an LLM or returns a hardcoded object, it goes through the same persist-and-resume machinery. You don't need to think about which kind of task you're writing when you think about durability.
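
The mode rules described in this phase are simple enough to express as one function. This is an illustrative sketch only: `TaskProps` is a made-up shape, not the real `<Task>` prop type, and `classifyTask` is a hypothetical helper.

```ts
// Toy classifier for the three execution modes described above.
// `TaskProps` is a hypothetical shape, not the real <Task> prop type.

type TaskProps = {
  agent?: object;    // any agent instance
  children: unknown; // prompt text, a function, or plain data
};

type ExecutionMode = "agent" | "compute" | "static";

function classifyTask(props: TaskProps): ExecutionMode {
  if (props.agent !== undefined) return "agent"; // agent wins when present
  if (typeof props.children === "function") return "compute";
  return "static";
}

const modes = [
  classifyTask({ agent: { id: "claude" }, children: "Analyze the codebase" }),
  classifyTask({ children: () => ({ passed: true }) }),
  classifyTask({ children: { environment: "production", debug: false } }),
];
```

Note that the `agent` check comes first, so a task with an agent and function children is still treated as an agent task in this sketch.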

## Phase 5: Persist Durable State

Here's the aha moment.

After each task attempt, Smithers writes durable records to SQLite. Not "optionally." Not "if you configure a backend." Every time, unconditionally.

That includes:

- the validated task output row
- attempt metadata
- node state transitions
- render frames
- lifecycle events
- approval decisions
- cache entries when enabled

For schema-driven outputs, the durable identity is effectively:

```txt
(runId, nodeId, iteration)
```

That triple is what makes everything else work. Resume, replay, crash recovery -- they all reduce to "read the rows keyed by `(runId, nodeId, iteration)` and render again." On the next render, [`ctx.outputMaybe(...)`](/concepts/workflow-state) and [`ctx.latest(...)`](/concepts/workflow-state) read from persisted rows, not from in-memory task objects.

The database isn't a log you consult after the fact. It's the control plane.

## Re-rendering Is the Control Plane

This is worth saying twice, because it's the single idea that makes the rest of the system simple.

Smithers does not mutate a long-lived in-memory graph after each task. Instead, it re-renders with the latest persisted context. The new frame might look different from the old one -- a branch might activate, a loop might advance, a conditional task might appear for the first time.

This is what enables:

- dynamic branching based on completed outputs
- iterative loops that stop when a reviewer approves
- conditional task visibility
- [hot reload](/guides/hot-reload) for future work without restarting the run

In other words, the rendered frame _is_ the current execution plan. There is no separate "plan" data structure that drifts out of sync with reality. The plan is regenerated from truth on every cycle.

## Resume Semantics

When you run with `resume: true`, Smithers:

1. loads the prior run metadata
2. reloads persisted outputs and internal state
3. renders the workflow again with that state available in `ctx`
4. skips tasks that already have valid output rows
5. continues from the first unfinished mounted work

No special resume logic. The same render-execute-persist loop runs; it just starts with a non-empty database. Because the persisted rows are unchanged, rendering produces the same frame as before, and the scheduler skips every task that already has a durable output row.

That only works if task identities stay stable. Renaming `id="review"` to `id="review-step"` creates a new durable node from the runtime's perspective. The old `review` output is still in the database, but nothing references it anymore. Be as creative as you like with control flow; be boring with your IDs.
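
A small simulation shows both halves of this: resume skips work that already has an output, and a renamed ID looks like brand-new work. The store, `runTasks`, and the `stopBefore` crash switch are all hypothetical constructs for this sketch.

```ts
// Toy resume: the same loop runs again, but the store is not empty.
// Everything here is hypothetical and only illustrates the idea.

type OutputStore = Record<string, unknown>; // key: nodeId, value: persisted output

function runTasks(taskIds: string[], persisted: OutputStore, stopBefore?: string) {
  const executedNow: string[] = [];
  for (const id of taskIds) {
    if (id in persisted) continue; // already has an output row: skip
    if (id === stopBefore) {
      return { status: "crashed" as const, executedNow }; // simulated crash
    }
    persisted[id] = { doneBy: id }; // execute + persist
    executedNow.push(id);
  }
  return { status: "finished" as const, executedNow };
}

const db: OutputStore = {};

// First run "crashes" before review executes.
const firstRun = runTasks(["analyze", "fix", "review"], db, "review");

// Resume with the same IDs: only review runs.
const resumed = runTasks(["analyze", "fix", "review"], db);

// Rename review -> review-step: the runtime sees a new node and runs it,
// while the old "review" row stays in the store, unreferenced.
const renamed = runTasks(["analyze", "fix", "review-step"], db);
```

After the last call, `db` holds rows for both `review` and `review-step`, which is the orphaned-output situation described above.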

## Determinism

Where does determinism come from? Not from restricting what you can do. From making the inputs explicit:

- stable `id` props identify tasks durably
- JSX structure defines ordering and control flow
- persisted rows define what has already happened
- schema validation prevents malformed outputs from entering durable state

The workflow can still be dynamic -- branches, loops, conditions, all of it. But the dynamism comes from data that is visible in the render context and can therefore be replayed after a crash or restart. If you can see it in `ctx`, the system can reproduce it.

## Mental Model

When in doubt, come back to this:

- JSX decides what the current workflow frame looks like
- SQLite decides what has already happened
- the scheduler only runs work that is both mounted and unblocked

Three sentences. That is the Smithers execution model: render, execute, persist, render again.

## Next Steps

- [Workflow State](/concepts/workflow-state) -- See how `ctx` exposes persisted outputs to each render.
- [Render Frame](/runtime/render-frame) -- Inspect the frame structure Smithers builds from JSX.
- [Data Model](/concepts/data-model) -- See how input, output tables, and internal metadata are stored.
- [Suspend and Resume](/concepts/suspend-and-resume) -- Understand crash recovery and durable replay.
- [runWorkflow](/runtime/run-workflow) -- Programmatic entry point for executing a workflow.
- [Planner Internals](/concepts/planner-internals) -- Lower-level details on extraction and scheduling.

---

## Workflow State

> How data flows between tasks, the ctx API, and the distinction between step outputs and shared workflow state.
> Source: https://smithers.sh/concepts/workflow-state

Most workflow engines give you a shared state bag. Every task reads from it, writes to it, and hopes nobody else clobbered their key in the meantime. You've seen this movie before -- it ends with race conditions and debugging sessions at 2 AM.

Smithers doesn't have shared mutable state. There is no global bag. Each task writes one typed output to SQLite, and downstream tasks read those outputs through `ctx`. That's the whole model.

Let's see what that looks like.

## How Data Flows Between Tasks

Forget function pipelines where return values pass hand-to-hand. In Smithers, tasks communicate through persisted outputs and re-renders:

```tsx
export default smithers((ctx) => {
  const analysis = ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" });

  return (
    <Workflow name="pipeline">
      <Task id="analyze" output={outputs.analysis} agent={analyst}>
        {`Analyze ${ctx.input.repo}`}
      </Task>

      {analysis ? (
        <Task id="fix" output={outputs.fix} agent={coder}>
          {`Fix these issues: ${analysis.issues.join(", ")}`}
        </Task>
      ) : null}
    </Workflow>
  );
});
```

Read that again slowly. `fix` is not wired to `analyze` by a function call. It mounts only once `analysis` has a value, and that value is read back from SQLite on a later render.

The flow is:

1. First render: `analysis` is `undefined`, only `analyze` is mounted
2. `analyze` runs, output is persisted to SQLite
3. Second render: `analysis` has a value, `fix` is mounted
4. `fix` runs using `analysis` data in its prompt

You might be wondering: "Why not just pass the return value directly?" Because persisted outputs buy you something return values can't -- durability. If the process crashes after step 2, the output is already in SQLite. On restart, the render in step 3 picks up right where it left off.

## The Context API

The `ctx` object gives you three ways to read outputs. Each exists for a reason.

### `ctx.output(schema, { nodeId })`

Returns the output or **throws** if it doesn't exist yet. Use it when you *know* the upstream task has completed -- inside a `<Sequence>`, for instance, where ordering is guaranteed:

```tsx
// Safe — "analyze" always completes before "report" in a Sequence
<Sequence>
  <Task id="analyze" output={outputs.analysis} agent={analyst}>...</Task>
  <Task id="report" output={outputs.report}>
    {{ summary: ctx.output(outputs.analysis, { nodeId: "analyze" }).summary }}
  </Task>
</Sequence>
```

No uncertainty here. The `<Sequence>` guarantees `analyze` finishes first, so `ctx.output` will always find data.

### `ctx.outputMaybe(schema, { nodeId })`

Returns the output or `undefined`. This is the one you reach for when you're conditionally rendering -- when the answer to "has this task run yet?" controls what mounts next:

```tsx
const analysis = ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" });

// Only mount "fix" if "analyze" has produced output
{analysis ? (
  <Task id="fix" output={outputs.fix} agent={coder}>
    {`Fix: ${analysis.issues.join(", ")}`}
  </Task>
) : null}
```

### `ctx.latest(schema, nodeId)`

Returns the most recent output **across all loop iterations**. Without this, every iteration would see only its own output, and you'd have no way to feed one iteration's result into the next:

```tsx
const latestReview = ctx.latest("review", "review");

<Loop until={latestReview?.approved === true} maxIterations={5}>
  <Task id="review" output={outputs.review} agent={reviewer}>
    {`Review the code. Previous feedback: ${latestReview?.feedback ?? "none"}`}
  </Task>
</Loop>
```

### When to Use Each

| Method | Returns | Throws? | Use case |
| --- | --- | --- | --- |
| `ctx.output()` | `T` | Yes, if missing | Inside sequential blocks where the upstream task is guaranteed to exist |
| `ctx.outputMaybe()` | `T \| undefined` | No | Conditional rendering, gating downstream tasks |
| `ctx.latest()` | `T \| undefined` | No | Inside loops, to read the most recent iteration's output |

The pattern: if you're certain the data exists, use `output`. If you're branching on whether it exists, use `outputMaybe`. If you're looping, use `latest`.
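
The difference between the three readers is easiest to see against a tiny in-memory row list. The sketch below is an assumption-laden model: `makeToyCtx` and `OutputRow` are hypothetical, the signatures are simplified to plain strings, and it assumes `output` and `outputMaybe` read the row for the current iteration.

```ts
// Toy versions of the three read helpers, backed by an in-memory row list.
// Signatures are simplified and hypothetical; the real ctx methods differ.

type OutputRow = { table: string; nodeId: string; iteration: number; value: unknown };

function makeToyCtx(rows: OutputRow[], currentIteration: number) {
  // Returns the current iteration's value, or undefined if there is no row.
  function outputMaybe(table: string, nodeId: string): unknown {
    for (const row of rows) {
      if (row.table === table && row.nodeId === nodeId && row.iteration === currentIteration) {
        return row.value;
      }
    }
    return undefined;
  }
  // Same lookup, but a missing row is an error.
  function output(table: string, nodeId: string): unknown {
    const value = outputMaybe(table, nodeId);
    if (value === undefined) throw new Error(`no output for ${table}/${nodeId}`);
    return value;
  }
  // Ignores the current iteration and picks the highest one on record.
  function latest(table: string, nodeId: string): unknown {
    let best: OutputRow | undefined = undefined;
    for (const row of rows) {
      if (row.table !== table || row.nodeId !== nodeId) continue;
      if (best === undefined || row.iteration > best.iteration) best = row;
    }
    return best === undefined ? undefined : best.value;
  }
  return { output, outputMaybe, latest };
}

const reviewHistory: OutputRow[] = [
  { table: "review", nodeId: "review", iteration: 0, value: { approved: false, feedback: "add tests" } },
  { table: "review", nodeId: "review", iteration: 1, value: { approved: true, feedback: "LGTM" } },
];

// Rendering iteration 2, which has no review row yet.
const toyCtx = makeToyCtx(reviewHistory, 2);

const maybeReview = toyCtx.outputMaybe("review", "review");
const newestReview = toyCtx.latest("review", "review") as { approved: boolean; feedback: string };

let outputThrew = false;
try {
  toyCtx.output("review", "review");
} catch (_err) {
  outputThrew = true;
}
```

With no row for the current iteration, `outputMaybe` yields `undefined`, `output` throws, and `latest` still returns the iteration 1 review.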

### `ctx.latestArray(value, schema)`

Parses a value as a JSON array and validates each element against a Zod schema. Invalid items are silently dropped. This is useful when an agent returns a JSON string containing an array, and you want type-safe, validated elements:

```tsx
const items = ctx.latestArray(
  ctx.latest("findings", "scan")?.items,
  z.object({ severity: z.string(), message: z.string() }),
);

// items: Array<{ severity: string; message: string }>
// Malformed entries are filtered out automatically
```

If `value` is a string, it is JSON-parsed first. If it is already an array, each element is validated directly. Non-array values are wrapped in a single-element array before validation.
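
Those normalization rules can be sketched without Zod by swapping the schema for a type-guard function. `parseValidArray` and `isFinding` are hypothetical names, and returning an empty array for an unparseable string is an assumption of this sketch, not documented behavior.

```ts
// Toy re-implementation of the normalization rules described above.
// `parseValidArray` and `isFinding` are hypothetical; the real helper takes a Zod schema.

type Finding = { severity: string; message: string };

function isFinding(item: unknown): item is Finding {
  if (typeof item !== "object" || item === null) return false;
  const candidate = item as { severity?: unknown; message?: unknown };
  return typeof candidate.severity === "string" && typeof candidate.message === "string";
}

function parseValidArray<T>(value: unknown, isValid: (item: unknown) => item is T): T[] {
  let parsed: unknown = value;
  if (typeof value === "string") {
    try {
      parsed = JSON.parse(value); // strings are JSON-parsed first
    } catch (_err) {
      return []; // assumption: unparseable strings yield no items
    }
  }
  // Non-array values are wrapped in a single-element array.
  const items: unknown[] = Array.isArray(parsed) ? parsed : [parsed];
  return items.filter(isValid); // invalid items are dropped silently
}

const fromString = parseValidArray(
  '[{"severity":"high","message":"SQL injection"},{"severity":1}]',
  isFinding,
);
const fromArray = parseValidArray([{ severity: "low", message: "typo" }, "junk"], isFinding);
const fromSingle = parseValidArray({ severity: "medium", message: "unused import" }, isFinding);
const fromMissing = parseValidArray(undefined, isFinding);
```

Each of the first three calls yields exactly one valid `Finding`. An `undefined` input is wrapped, fails validation, and produces an empty array, so a missing upstream output does not need special handling in this model.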

### Table Name Resolution

The first argument to `ctx.output()`, `ctx.outputMaybe()`, `ctx.latest()`, and `ctx.iterationCount()` (written as `schema` in the signatures above) identifies the output table and accepts three forms:

| Form | Example | Resolution |
| --- | --- | --- |
| String schema key | `"analysis"` | Looked up directly in the outputs snapshot |
| Zod schema | `outputs.analysis` | Resolved via the `createSmithers()` schema registry to its key name |
| Drizzle table | `analysisTable` | Resolved via `getTableName()` to the SQL table name |

The recommended form is the Zod schema from `outputs`, which provides type inference. String keys work when you need dynamic lookups.

### `ctx.auth`

The authentication context passed via `RunOptions.auth`. Returns `null` when no auth context was configured:

```tsx
export default smithers((ctx) => (
  <Workflow name="gated">
    <Task id="deploy" output={outputs.deploy} agent={deployer} skipIf={ctx.auth?.role !== "admin"}>
      {`Deploy as ${ctx.auth?.userId}`}
    </Task>
  </Workflow>
));
```

## Workflow Input

`ctx.input` holds the workflow's input, validated against its schema before execution begins:

```tsx
export default smithers((ctx) => (
  <Workflow name="deploy">
    <Task id="build" output={outputs.build}>
      {{ target: ctx.input.environment }}
    </Task>
  </Workflow>
));
```

Once a run starts, the input is immutable and persisted. Passing different input on resume is an error. This isn't a limitation -- it's a guarantee. You can always trust that `ctx.input` is the same value that started the run.

### Input Payload Unwrapping

When input is stored in a table with only `runId` and `payload` columns (the default `_smithers_input` table), Smithers automatically unwraps the `payload` field. If `payload` is a JSON string, it is parsed. This means `ctx.input` always gives you the clean, deserialized input object -- never the raw database row with `runId` and `payload` wrappers.

## Step Outputs vs Shared State

Here's the key insight: in Smithers, **outputs are the state**. There is no separate "workflow state" object that tasks read from and write to. The rendered JSX tree plus the persisted outputs together *are* the workflow state.

### In Smithers: Outputs are the state

Each task produces a typed, validated output. That output is the state. Think of it like a database where every task owns its own table, rather than a whiteboard where everyone scribbles in the same corner.

This has important consequences:

- **No race conditions** -- Tasks don't compete to update a shared store. Each task writes to its own output table.
- **Natural type safety** -- Each output has its own Zod schema. There's no untyped global bag.
- **Resumability** -- Because each output is persisted independently, crash recovery is straightforward.

### Data sharing across steps

"But what if two tasks need the same data?" They both read from the same upstream output. No copying, no shared store, no coordination:

```tsx
const analysis = ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" });

<Parallel>
  <Task id="fix" output={outputs.fix} agent={coder}>
    {`Fix: ${analysis?.issues.join(", ")}`}
  </Task>
  <Task id="report" output={outputs.report} agent={writer}>
    {`Summarize: ${analysis?.summary}`}
  </Task>
</Parallel>
```

Both `fix` and `report` read from the same `analysis` output. No shared mutable state is needed.

## Iteration State

Inside a `<Loop>`, each iteration produces separate output rows keyed by `(runId, nodeId, iteration)`. This means:

- Iteration 0's review and iteration 1's review are stored as separate rows
- `ctx.latest()` finds the highest iteration number
- `ctx.iteration` gives you the current iteration (0-indexed)
- `ctx.iterationCount(schema, nodeId)` tells you how many iterations have completed

```tsx
const latestDraft = ctx.latest("draft", "write");
const latestReview = ctx.latest("review", "review");

<Loop until={latestReview?.approved} maxIterations={5}>
  <Sequence>
    <Task id="write" output={outputs.draft} agent={writer}>
      {latestReview
        ? `Revise based on feedback: ${latestReview.feedback}`
        : `Write about: ${ctx.input.topic}`}
    </Task>
    <Task id="review" output={outputs.review} agent={reviewer}>
      {`Review: ${latestDraft?.text}`}
    </Task>
  </Sequence>
</Loop>
```

Notice how `ctx.latest()` is doing the heavy lifting. On iteration 0, `latestReview` is `undefined`, so the writer gets the original topic. On iteration 1, `latestReview` has feedback from the first review, so the writer revises. Each iteration builds on the last, and you didn't have to manage any of that bookkeeping yourself.
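
The same write/review loop can be simulated with fake agents to see how per-iteration rows accumulate. This is a toy model: `fakeWriter`, `fakeReviewer`, and `latestOf` are hypothetical, and array indexes stand in for the `iteration` part of the row key.

```ts
// Toy write/review loop showing per-iteration rows and "latest" lookups.
// The writer and reviewer are fake functions; nothing here calls Smithers.

type Draft = { text: string };
type Review = { approved: boolean; feedback: string };

const draftRows: Draft[] = [];   // index = iteration
const reviewRows: Review[] = []; // index = iteration

// Stand-in for ctx.latest(): the row with the highest iteration, if any.
function latestOf<T>(rows: T[]): T | undefined {
  return rows.length === 0 ? undefined : rows[rows.length - 1];
}

// Fake agents: the reviewer approves once the draft mentions tests.
function fakeWriter(topic: string, feedback: string | undefined): Draft {
  return {
    text: feedback === undefined ? `Notes on ${topic}` : `Notes on ${topic}, with tests`,
  };
}

function fakeReviewer(draft: Draft): Review {
  return draft.text.indexOf("tests") !== -1
    ? { approved: true, feedback: "LGTM" }
    : { approved: false, feedback: "mention tests" };
}

const maxIterations = 5;
let iteration = 0;

while (iteration < maxIterations && latestOf(reviewRows)?.approved !== true) {
  const previous = latestOf(reviewRows); // undefined on iteration 0
  const draft = fakeWriter("caching", previous?.feedback);
  draftRows.push(draft);                // row for (write, iteration)
  reviewRows.push(fakeReviewer(draft)); // row for (review, iteration)
  iteration++;
}
```

The loop stops after two iterations. Both reviews remain stored as separate rows, so the rejected first review is still available for inspection after the run.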

## Persistence

All task outputs are persisted to SQLite immediately on completion. This is what makes workflows durable -- not your code, not a try/catch, just the fact that every output hits disk before the next task starts.

| What | Where | Keyed by |
| --- | --- | --- |
| Workflow input | `_smithers_runs` | `runId` |
| Task output | User-defined table | `(runId, nodeId)` or `(runId, nodeId, iteration)` |
| Execution metadata | `_smithers_nodes`, `_smithers_attempts` | Internal keys |

You don't need to manage persistence yourself. Smithers handles it as part of the execution loop.

## The Re-render Cycle

Here's where it all comes together. Smithers re-renders the JSX tree after each task completes. This is how data-dependent control flow works without imperative `if` statements or state machines:

1. **Render 1**: `ctx.outputMaybe("analysis", ...)` returns `undefined` -- only `analyze` is mounted
2. `analyze` completes -- output persisted
3. **Render 2**: `ctx.outputMaybe("analysis", ...)` returns the analysis -- `fix` is mounted
4. `fix` completes -- output persisted
5. **Render 3**: all tasks complete -- workflow finishes

This cycle is automatic. You write declarative JSX; the render loop drives execution forward. The tree is a function of the persisted outputs, and the persisted outputs are a function of which tasks have run. One feeds the other until there's nothing left to do.

## Next Steps

- [Control Flow](/concepts/control-flow) -- The four primitives that determine execution order.
- [Data Model](/concepts/data-model) -- How Schema, Model, and metadata fit together at the persistence layer.
- [Suspend and Resume](/concepts/suspend-and-resume) -- How state survives crashes and approval gates.

---

## Control Flow

> How to sequence, parallelize, branch, and loop tasks in Smithers workflows.
> Source: https://smithers.sh/concepts/control-flow

Four primitives. That's the whole toolkit.

[`<Sequence>`](/components/sequence), [`<Parallel>`](/components/parallel), [`<Branch>`](/components/branch), [`<Loop>`](/components/loop) -- these are the only control-flow components you need to wire together any workflow. They compose like building blocks: nest them, combine them, and the [execution graph](/concepts/execution-model) writes itself.

Let's build up from the simplest case.

## Sequential Execution: `<Sequence>`

You have three [tasks](/components/task). Each one needs the previous one's result. This is the most common pattern in programming, and `<Sequence>` does exactly what you'd expect: run children top to bottom, one at a time.

```tsx
<Workflow name="pipeline">
  <Sequence>
    <Task id="fetch" output={outputs.fetch}>
      {{ url: "https://api.example.com" }}
    </Task>
    <Task id="transform" output={outputs.transform} agent={transformer}>
      {`Transform: ${ctx.output(outputs.fetch, { nodeId: "fetch" }).url}`}
    </Task>
    <Task id="store" output={outputs.store}>
      {{ stored: true }}
    </Task>
  </Sequence>
</Workflow>
```

`fetch` runs first. Only after it completes does `transform` start. `store` runs last.

> **Tip:** [`<Workflow>`](/components/workflow) already sequences its direct children implicitly. You only need an explicit `<Sequence>` when nesting sequential groups inside [`<Parallel>`](/components/parallel), [`<Branch>`](/components/branch), or [`<Loop>`](/components/loop).

So if `<Workflow>` already sequences things, why does `<Sequence>` exist at all? Because you'll want to put ordered steps *inside* the other primitives. You'll see this in a moment.

## Parallel Execution: `<Parallel>`

Now suppose you're running a CI pipeline. Linting, type-checking, and tests don't depend on each other. Why run them one at a time?

```tsx
<Workflow name="checks">
  <Parallel>
    <Task id="lint" output={outputs.lint}>{{ errors: 0 }}</Task>
    <Task id="typecheck" output={outputs.typecheck}>{{ passed: true }}</Task>
    <Task id="test" output={outputs.test}>{{ passed: true }}</Task>
  </Parallel>
</Workflow>
```

All three tasks start simultaneously. The parallel group completes when **all** children have finished.

### Limiting Concurrency

What if you're calling an API with a rate limit of two concurrent requests? You don't want four agent calls hammering it at once.

Use `maxConcurrency` to cap it:

```tsx
<Parallel maxConcurrency={2}>
  <Task id="repo-1" output={outputs.repo1} agent={analyst}>Analyze alpha.</Task>
  <Task id="repo-2" output={outputs.repo2} agent={analyst}>Analyze beta.</Task>
  <Task id="repo-3" output={outputs.repo3} agent={analyst}>Analyze gamma.</Task>
  <Task id="repo-4" output={outputs.repo4} agent={analyst}>Analyze delta.</Task>
</Parallel>
```

At most two agent calls run at the same time. As each completes, the next queued task starts.
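
To see what the cap does to start times, here is a small plain-TypeScript simulation. It is a sketch, not how Smithers schedules work internally: `simulateStarts` and the durations are made up for illustration, and it assumes queued tasks start in declaration order as slots free up.

```ts
// Hypothetical model: each task has a duration in arbitrary time units.
// Returns the time at which each task would start under a concurrency cap,
// assuming tasks are started in declaration order as slots become free.
function simulateStarts(durations: number[], maxConcurrency: number): number[] {
  if (maxConcurrency < 1) throw new Error("maxConcurrency must be >= 1");
  // freeAt[i] is the time at which slot i becomes available.
  const freeAt: number[] = [];
  for (let i = 0; i < maxConcurrency; i++) freeAt.push(0);

  return durations.map((duration) => {
    // Pick the slot that frees up earliest.
    let slot = 0;
    for (let i = 1; i < freeAt.length; i++) {
      if (freeAt[i] < freeAt[slot]) slot = i;
    }
    const start = freeAt[slot];
    freeAt[slot] = start + duration;
    return start;
  });
}

// Four analyses taking 3, 1, 2 and 2 units with a cap of two.
const starts = simulateStarts([3, 1, 2, 2], 2);
// starts is [0, 0, 1, 3]: the third task waits for the first free slot,
// and the fourth waits until time 3.
```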

### Combining Parallel and Sequential

Here's where composition gets interesting. Remember the question about why `<Sequence>` exists? This is the answer:

```tsx
<Workflow name="ci">
  <Parallel>
    <Sequence>
      <Task id="build-web" output={outputs.buildWeb}>{{ ok: true }}</Task>
      <Task id="deploy-web" output={outputs.deployWeb}>{{ ok: true }}</Task>
    </Sequence>
    <Sequence>
      <Task id="build-api" output={outputs.buildApi}>{{ ok: true }}</Task>
      <Task id="deploy-api" output={outputs.deployApi}>{{ ok: true }}</Task>
    </Sequence>
  </Parallel>
</Workflow>
```

The two sequences run in parallel. Within each sequence, tasks run one at a time. `deploy-web` waits for `build-web`, but `build-api` does not wait for `build-web`.

Two pipelines. Running side by side. Each internally ordered. That's a CI matrix in a dozen lines of JSX.
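
One way to reason about nested groups is to ask how long the whole tree takes. The sketch below is a toy model, not Smithers' internal representation: `PlanNode`, `finishTime`, and the durations are hypothetical, and it assumes no concurrency cap.

```ts
// Hypothetical model of a workflow tree.
type PlanNode =
  | { kind: "task"; id: string; duration: number }
  | { kind: "sequence"; children: PlanNode[] }
  | { kind: "parallel"; children: PlanNode[] };

// A sequence takes the sum of its children; a parallel group takes as long
// as its slowest child.
function finishTime(node: PlanNode): number {
  switch (node.kind) {
    case "task":
      return node.duration;
    case "sequence":
      return node.children.reduce((total, child) => total + finishTime(child), 0);
    case "parallel":
      return Math.max(0, ...node.children.map((child) => finishTime(child)));
  }
}

const task = (id: string, duration: number): PlanNode => ({ kind: "task", id, duration });

const ci: PlanNode = {
  kind: "parallel",
  children: [
    { kind: "sequence", children: [task("build-web", 2), task("deploy-web", 1)] },
    { kind: "sequence", children: [task("build-api", 3), task("deploy-api", 4)] },
  ],
};

// 7: the api pipeline (3 + 4) is the slower of the two.
const total = finishTime(ci);
```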

## Conditional Logic: `<Branch>`

Tests passed? Deploy. Tests failed? Notify the team. You've written this `if/else` a thousand times. `<Branch>` makes it declarative:

```tsx
<Workflow name="deploy-pipeline">
  <Task id="test" output={outputs.test}>{{ passed: true, error: null }}</Task>

  <Branch
    if={ctx.output(outputs.test, { nodeId: "test" }).passed}
    then={
      <Task id="deploy" output={outputs.deploy}>
        {{ url: "https://prod.example.com" }}
      </Task>
    }
    else={
      <Task id="notify" output={outputs.notify}>
        {{ message: "Tests failed, skipping deploy." }}
      </Task>
    }
  />
</Workflow>
```

Only the selected branch is mounted. The other branch's tasks do not exist in the execution plan. This isn't short-circuit evaluation -- it's structural. The losing branch is never part of the graph.
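
The structural point can be illustrated with a toy tree walk. This is only a sketch: `FlowNode` and `mountedTaskIds` are hypothetical names, not Smithers APIs.

```ts
// Hypothetical model: a branch carries both sides, but only one is ever visited.
type FlowNode =
  | { kind: "task"; id: string }
  | { kind: "sequence"; children: FlowNode[] }
  | { kind: "branch"; condition: boolean; then: FlowNode; else?: FlowNode };

// Collect the task ids that end up in the plan.
function mountedTaskIds(node: FlowNode): string[] {
  switch (node.kind) {
    case "task":
      return [node.id];
    case "sequence": {
      const ids: string[] = [];
      for (const child of node.children) ids.push(...mountedTaskIds(child));
      return ids;
    }
    case "branch": {
      // The side that is not selected is never walked, so it contributes nothing.
      const chosen = node.condition ? node.then : node.else;
      return chosen ? mountedTaskIds(chosen) : [];
    }
  }
}

const deployPipeline = (testsPassed: boolean): FlowNode => ({
  kind: "sequence",
  children: [
    { kind: "task", id: "test" },
    {
      kind: "branch",
      condition: testsPassed,
      then: { kind: "task", id: "deploy" },
      else: { kind: "task", id: "notify" },
    },
  ],
});

const whenPassing = mountedTaskIds(deployPipeline(true));  // ["test", "deploy"]
const whenFailing = mountedTaskIds(deployPipeline(false)); // ["test", "notify"]
```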

### Branching into Complex Sub-graphs

Each branch can contain any workflow element -- not just a single task. A critical bug might need a hotfix *and* an emergency deploy. A minor bug just goes to the backlog:

```tsx
<Branch
  if={severity === "critical"}
  then={
    <Sequence>
      <Task id="hotfix" output={outputs.hotfix} agent={coder}>
        Write a hotfix for the critical issue.
      </Task>
      <Task id="emergency-deploy" output={outputs.deploy}>{{ deployed: true }}</Task>
    </Sequence>
  }
  else={
    <Task id="backlog" output={outputs.backlog}>{{ queued: true }}</Task>
  }
/>
```

### JSX Conditions

Because Smithers re-renders the tree each [frame](/runtime/render-frame), you can also branch with plain [JSX](/jsx/overview) conditions:

```tsx
{analysis?.hasIssues ? (
  <Task id="fix" output={outputs.fix} agent={coder}>Fix the issues.</Task>
) : null}
```

When do you reach for `<Branch>` vs a ternary? Use `<Branch>` when you want both paths explicitly declared in the graph -- it documents the fork. Use JSX conditions for simpler gating on whether a task should exist at all.

## Looping: `<Loop>`

Some work isn't done until it's done. You write a draft, get feedback, revise, get more feedback. This is the pattern `<Loop>` is built for:

```tsx
<Loop
  until={ctx.outputMaybe(outputs.review, { nodeId: "review" })?.approved === true}
  maxIterations={5}
  onMaxReached="return-last"
>
  <Sequence>
    <Task id="write" output={outputs.draft} agent={writer}>
      Write a draft.
    </Task>
    <Task id="review" output={outputs.review} agent={reviewer}>
      Review the draft.
    </Task>
  </Sequence>
</Loop>
```

Each iteration:

1. The `until` condition is evaluated at render time.
2. If it is `false`, the loop body runs.
3. Outputs are persisted per iteration (keyed by the `iteration` column).
4. The tree re-renders with updated context.
5. The `until` condition is re-evaluated.

The loop stops when `until` returns `true` or when `maxIterations` is hit -- whichever comes first. Without `maxIterations`, a stubborn reviewer could keep you looping forever.
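
The stopping rules can be sketched as an ordinary function. This is a simplified, synchronous model under stated assumptions: `runLoop` and `LoopOptions` are hypothetical names, and the real runtime re-renders a tree rather than calling a callback.

```ts
type OnMaxReached = "return-last" | "fail";

interface LoopOptions<T> {
  until: (last: T | undefined) => boolean; // checked before every iteration
  body: (iteration: number, last: T | undefined) => T;
  maxIterations: number;
  onMaxReached?: OnMaxReached;
}

function runLoop<T>(options: LoopOptions<T>): { iterations: number; last: T | undefined } {
  const { until, body, maxIterations, onMaxReached = "return-last" } = options;
  let last: T | undefined;
  let iterations = 0;
  while (!until(last)) {
    if (iterations >= maxIterations) {
      if (onMaxReached === "fail") {
        throw new Error(`Loop hit maxIterations (${maxIterations})`);
      }
      break; // "return-last": keep the final iteration's output and move on
    }
    last = body(iterations, last);
    iterations += 1;
  }
  return { iterations, last };
}

// A reviewer that approves the third draft (iteration index 2).
const result = runLoop<{ approved: boolean }>({
  until: (review) => review?.approved === true,
  body: (iteration) => ({ approved: iteration === 2 }),
  maxIterations: 5,
});
// result.iterations is 3 and result.last is { approved: true }
```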

### Accessing Previous Iteration Output

The interesting question: how does iteration N+1 know what iteration N produced? Use [`ctx.latest()`](/concepts/workflow-state) to feed the previous iteration's output back into the next:

```tsx
const latestReview = ctx.latest("review", "review");
const latestDraft = ctx.latest("draft", "write");

<Loop until={latestReview?.approved === true} maxIterations={5}>
  <Sequence>
    <Task id="write" output={outputs.draft} agent={writer}>
      {latestReview
        ? `Improve the draft. Feedback: ${latestReview.feedback}`
        : `Write a first draft about: ${ctx.input.topic}`}
    </Task>
    <Task id="review" output={outputs.review} agent={reviewer}>
      {`Review this draft:\n${latestDraft?.text ?? ""}`}
    </Task>
  </Sequence>
</Loop>
```

On the first iteration, `latestReview` is `undefined`, so the writer gets the original topic. On every subsequent iteration, the writer gets the reviewer's feedback. This is how iterative refinement works: each pass incorporates what the previous pass learned.
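
A rough mental model for this lookup, assuming it returns the row with the highest iteration number for a node: the `IterationStore` class below is a hypothetical in-memory stand-in, not the Smithers storage layer.

```ts
interface OutputRow<T> {
  nodeId: string;
  iteration: number;
  value: T;
}

// Hypothetical in-memory stand-in for per-iteration output rows.
class IterationStore<T> {
  private rows: OutputRow<T>[] = [];

  write(nodeId: string, iteration: number, value: T): void {
    this.rows.push({ nodeId, iteration, value });
  }

  // Value from the highest iteration for this node, or undefined if none exists.
  latest(nodeId: string): T | undefined {
    let best: OutputRow<T> | undefined;
    for (const row of this.rows) {
      if (row.nodeId !== nodeId) continue;
      if (best === undefined || row.iteration > best.iteration) best = row;
    }
    return best?.value;
  }
}

interface Review {
  approved: boolean;
  feedback: string;
}

// Mirrors the prompt selection in the writer task above.
function writerPrompt(topic: string, latestReview: Review | undefined): string {
  return latestReview
    ? `Improve the draft. Feedback: ${latestReview.feedback}`
    : `Write a first draft about: ${topic}`;
}

const reviews = new IterationStore<Review>();
const firstPrompt = writerPrompt("caching", reviews.latest("review"));
reviews.write("review", 0, { approved: false, feedback: "Add benchmarks." });
reviews.write("review", 1, { approved: false, feedback: "Shorten the intro." });
const laterPrompt = writerPrompt("caching", reviews.latest("review"));
```

Before any review exists, the writer gets the topic. Afterwards it gets the feedback from the most recent iteration only.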

### Max Iterations

| `onMaxReached` | Behavior |
| --- | --- |
| `"return-last"` | Stop looping, keep the final iteration's output, continue the workflow. This is the default. |
| `"fail"` | Stop looping and fail the workflow. |

## Choosing the Right Pattern

You have four primitives and you've seen them individually. Now the question is: which one do you reach for?

### Quick Reference

| Primitive | Purpose | Use when... |
| --- | --- | --- |
| `<Sequence>` | Run tasks in order | Each step depends on the previous step's completion |
| `<Parallel>` | Run tasks concurrently | Tasks are independent and can run at the same time |
| `<Branch>` | Choose one path | The next step depends on a runtime condition |
| `<Loop>` | Repeat until done | Work needs iterative refinement (implement -> review -> fix) |

### `<Parallel>` vs Dynamic Tasks

This trips people up. When do you use `<Parallel>` and when do you use `.map()`?

Use `<Parallel>` when you have a **fixed set of different operations** on the same data:

```tsx
// Three different reviewers, each doing different work
<Parallel>
  <Task id="security-review" agent={securityReviewer}>...</Task>
  <Task id="perf-review" agent={perfReviewer}>...</Task>
  <Task id="style-review" agent={styleReviewer}>...</Task>
</Parallel>
```

Use **[dynamic JSX](/jsx/overview)** when you have a **variable list of items** that need the same operation:

```tsx
// Process each ticket the same way
{tickets.map((ticket) => (
  <Task key={ticket.id} id={`${ticket.id}:implement`} output={outputs.implement} agent={coder}>
    {`Implement: ${ticket.description}`}
  </Task>
))}
```

The distinction: `<Parallel>` is for heterogeneous fan-out (different work, same time). `.map()` is for homogeneous fan-out (same work, different data).

### Composition Patterns

| Pattern | What happens | Use case |
| --- | --- | --- |
| `<Sequence>` -> `<Sequence>` | Flat sequential chain | Simple pipelines |
| `<Parallel>` -> `<Task>` | Fan-out, then combine | Run parallel work, aggregate results |
| `<Loop>` -> `<Sequence>` | Iterative pipeline | Implement-review-fix cycles |
| `<Branch>` -> `<Sequence>` | Conditional multi-step | Different pipelines for different conditions |
| `<Parallel>` -> `<Sequence>` inside each | Parallel pipelines | Build + deploy web AND api simultaneously |

### Synchronization

Both `<Parallel>` and `<Loop>` are synchronization points. The task that follows one runs only after it has finished: every child for `<Parallel>`, the final iteration for `<Loop>`:

```tsx
<Workflow name="fan-out-fan-in">
  <Parallel>
    <Task id="a" ...>...</Task>
    <Task id="b" ...>...</Task>
    <Task id="c" ...>...</Task>
  </Parallel>
  {/* This only runs after a, b, AND c all finish */}
  <Task id="combine" ...>...</Task>
</Workflow>
```

This is fan-out/fan-in. The parallel block is a barrier. Nothing downstream proceeds until everything upstream has settled.

## Conditional Skipping

All control-flow components support `skipIf` to bypass them entirely:

```tsx
<Sequence skipIf={ctx.input.skipTests}>
  <Task id="unit-tests" output={outputs.unitTests}>{{ passed: true }}</Task>
  <Task id="e2e-tests" output={outputs.e2eTests}>{{ passed: true }}</Task>
</Sequence>
```

When `skipIf` is `true`, the component returns `null` and none of its children are mounted.

## Next Steps

- [Sequence](/components/sequence) -- Component API for ordered execution.
- [Parallel](/components/parallel) -- Component API for concurrency and `maxConcurrency`.
- [Branch](/components/branch) -- Component API for conditional paths.
- [Loop](/components/loop) -- Component API for iterative workflows.
- [Workflow State](/concepts/workflow-state) -- How outputs and `ctx.latest()` drive control flow.
- [Implement-Review Loop](/guides/review-loop) -- See these primitives in a production pattern.

---

## Agents and Tools

> How to use AI agents and built-in tools within Smithers workflow tasks.
> Source: https://smithers.sh/concepts/agents-and-tools

You have a workflow. Some of its tasks require judgment -- reading code, spotting bugs, drafting implementations. You could write heuristics for that, but heuristics are brittle and exhausting to maintain. What you really want is to drop an AI into a task and say "figure it out." That is what agents are for.

## Agents in Workflows

An agent handles the thinking inside a task. Give a [`<Task>`](/components/task) the `agent` prop and three things happen: the children become the prompt, the agent reasons and responds, and Smithers validates the response against the output schema. No ceremony required.

### Using an Agent

The simplest case first, using the [AI SDK](https://ai-sdk.dev) with [Anthropic](https://docs.anthropic.com):

```tsx
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { read, grep, bash } from "smithers-orchestrator";

const codeAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a senior software engineer.",
  tools: { read, grep, bash },
});

<Task id="analyze" output={outputs.analysis} agent={codeAgent}>
  {`Analyze the codebase in ${ctx.input.repo} and identify security issues.`}
</Task>
```

That is the whole pattern. The children render into a prompt string. The agent uses its tools to explore the codebase. Smithers captures the response and checks it against the `analysis` schema. If the response is valid, the task completes. If not -- well, we will get to that.

### Agent Types

Where does the AI actually run? Smithers gives you two options, and they are interchangeable.

**[SDK Agents](/integrations/sdk-agents)** talk directly to a provider API. You pay per token, you get fine-grained control:

```ts
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { read, write, edit, grep, bash } from "smithers-orchestrator";

const claude = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  tools: { read, write, edit, grep, bash },
});
```

**[CLI Agents](/integrations/cli-agents)** wrap external AI command-line tools. Same interface, different engine under the hood:

```ts
import { ClaudeCodeAgent, CodexAgent, GeminiAgent } from "smithers-orchestrator";

const claude = new ClaudeCodeAgent({ model: "claude-sonnet-4-20250514" });
const codex = new CodexAgent({ model: "gpt-4.1", fullAuto: true });
const gemini = new GeminiAgent({ model: "gemini-2.5-pro" });
```

Why does this matter? Because the `<Task>` does not care which kind you hand it:

```tsx
// Same syntax regardless of agent type
<Task id="review" output={outputs.review} agent={claude}>
  Review the code for issues.
</Task>
```

Swap `claude` for `codex` and the task works the same way. The interface is the seam; what sits behind it is your choice.

### Structured Output

Agents do not return free-form text. They return data, validated against a [Zod](https://zod.dev) schema. This is the contract that makes agents composable -- downstream tasks can depend on the shape of what comes back.

```tsx
import { z } from "zod";
import { createSmithers } from "smithers-orchestrator";

const { outputs } = createSmithers({
  analysis: z.object({
    summary: z.string(),
    risk: z.enum(["low", "medium", "high"]),
    issues: z.array(z.object({
      file: z.string(),
      description: z.string(),
    })),
  }),
});

// The agent must return data matching the analysis schema
<Task id="analyze" output={outputs.analysis} agent={codeAgent} retries={2}>
  Analyze the codebase and report issues.
</Task>
```

What happens when an agent returns `{ summary: "...", risk: "critical" }`? Validation fails -- `"critical"` is not in the enum. Smithers feeds the Zod error back to the agent and retries. The agent sees its own mistake, corrects it, and tries again. Think of it as a compiler error for AI output.
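
The feedback loop can be sketched without any dependencies. Everything here is an illustration under stated assumptions: `validate` is a hand-rolled stand-in for the Zod schema, `forgetfulAgent` is a fake, and `runWithValidation` is a hypothetical name rather than a Smithers function.

```ts
type Risk = "low" | "medium" | "high";

interface Analysis {
  summary: string;
  risk: Risk;
}

// Hand-rolled stand-in for schema validation; returns an error message or null.
function validate(candidate: unknown): string | null {
  if (typeof candidate !== "object" || candidate === null) return "expected an object";
  const { summary, risk } = candidate as Record<string, unknown>;
  if (typeof summary !== "string") return "summary: expected string";
  if (risk !== "low" && risk !== "medium" && risk !== "high") {
    return `risk: expected "low" | "medium" | "high", received ${JSON.stringify(risk)}`;
  }
  return null;
}

type FakeAgent = (prompt: string, feedback: string | null) => unknown;

// Call the agent, validate, and pass the validation error into the next attempt.
function runWithValidation(
  agent: FakeAgent,
  prompt: string,
  retries: number,
): { output: Analysis; attempts: number } {
  let feedback: string | null = null;
  for (let attempt = 1; attempt <= retries + 1; attempt++) {
    const candidate = agent(prompt, feedback);
    feedback = validate(candidate);
    if (feedback === null) return { output: candidate as Analysis, attempts: attempt };
  }
  throw new Error(`Output still invalid after ${retries + 1} attempts: ${feedback}`);
}

// A fake agent that only gets the enum right after it has seen the error.
const forgetfulAgent: FakeAgent = (_prompt, feedback) => ({
  summary: "SQL injection in login handler",
  risk: feedback === null ? "critical" : "high",
});

const fixed = runWithValidation(forgetfulAgent, "Analyze the codebase.", 2);
// fixed.attempts is 2 and fixed.output.risk is "high"
```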

### Agent Fallback Chains

Agents fail. Models go down, rate limits hit, responses come back garbled. You do not want your workflow to stop because one provider had a bad minute.

Pass an array of agents to create a fallback chain:

```tsx
<Task
  id="implement"
  output={outputs.implement}
  agent={[codex, claude]}
  retries={2}
>
  Implement the feature described in the ticket.
</Task>
```

First attempt uses `codex`. If it fails, `claude` takes over on retry. This is a practical pattern: start with the fast, cheap option; fall back to the more capable one.
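
As a mental model, you can picture the chain as "attempt number picks the agent". The sketch below assumes that attempt `n` uses the `n`-th agent and that the last agent is reused once the list runs out; that selection rule, `StubAgent`, and `runWithFallback` are all assumptions for illustration, and real agents are asynchronous.

```ts
// Hypothetical synchronous stand-in for an agent.
type StubAgent<T> = { name: string; run: (prompt: string) => T };

function runWithFallback<T>(
  agents: StubAgent<T>[],
  prompt: string,
  retries: number,
): { output: T; usedAgent: string; attempts: number } {
  if (agents.length === 0) throw new Error("at least one agent is required");
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    // Assumption: move one step down the chain per attempt, then stay on the last agent.
    const agent = agents[Math.min(attempt, agents.length - 1)];
    try {
      return { output: agent.run(prompt), usedAgent: agent.name, attempts: attempt + 1 };
    } catch (error) {
      lastError = error;
    }
  }
  throw new Error(`All ${retries + 1} attempts failed: ${String(lastError)}`);
}

const codexStub: StubAgent<string> = {
  name: "codex",
  run: () => {
    throw new Error("rate limited");
  },
};
const claudeStub: StubAgent<string> = { name: "claude", run: (prompt) => `done: ${prompt}` };

const outcome = runWithFallback([codexStub, claudeStub], "Implement the feature", 2);
// outcome.usedAgent is "claude" and outcome.attempts is 2
```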

For the common case of a single fallback, there is a dedicated prop:

```tsx
<Task
  id="implement"
  output={outputs.implement}
  agent={codex}
  fallbackAgent={claude}
  retries={1}
>
  Implement the feature.
</Task>
```

## Multi-Agent Patterns

One agent per task is the simple case. But some problems benefit from multiple perspectives or a division of labor.

### Parallel Review

Ask two agents the same question and compare answers. This is the "get a second opinion" pattern:

```tsx
<Parallel>
  <Task id="review-claude" output={outputs.review} agent={claude} continueOnFail>
    <ReviewPrompt code={code} reviewer="claude" />
  </Task>
  <Task id="review-codex" output={outputs.review} agent={codex} continueOnFail>
    <ReviewPrompt code={code} reviewer="codex" />
  </Task>
</Parallel>
```

The [`continueOnFail`](/guides/error-handling) prop is important here. If one reviewer times out or crashes, the other still completes. You get at least one review instead of zero.

### Pipeline Handoff

Different agents are good at different things. Let each one do what it does best:

```tsx
<Sequence>
  <Task id="implement" output={outputs.implement} agent={codex}>
    Write the implementation.
  </Task>
  <Task id="review" output={outputs.review} agent={claude}>
    {`Review this implementation: ${ctx.output(outputs.implement, { nodeId: "implement" }).summary}`}
  </Task>
</Sequence>
```

Codex writes the code. Claude reviews it. You would not ask the same person to write and review their own work -- same logic applies here.

## Tools

An agent without tools is a brain in a jar. It can reason about what you tell it, but it cannot look at your files, run your tests, or check what is on disk. Tools fix that.

Smithers provides five [built-in tools](/integrations/tools), each doing one thing well:

| Tool | Purpose | Input |
| --- | --- | --- |
| `read` | Read a file | `{ path }` |
| `write` | Write a file | `{ path, content }` |
| `edit` | Apply a unified diff patch | `{ path, patch }` |
| `grep` | Search files with regex | `{ pattern, path? }` |
| `bash` | Execute a shell command | `{ cmd, args?, opts? }` |

### Assigning Tools to Agents

Pass them in when you create the agent:

```ts
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { read, write, edit, grep, bash } from "smithers-orchestrator";

const coder = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  tools: { read, write, edit, grep, bash },
});
```

Or grab them all at once:

```ts
import { tools } from "smithers-orchestrator";

const coder = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  tools,
});
```

### Sandboxing

You might be wondering: "I am giving an AI shell access. How do I not lose sleep over this?"

All tools are sandboxed to `rootDir` (defaults to the workflow directory). The constraints are straightforward:

- File paths are resolved relative to the root
- Symlinks that escape the sandbox are rejected
- Output is truncated to `maxOutputBytes` (default 200KB)
- Shell commands have a 60-second timeout
- Network access is blocked by default in `bash`

The agent can explore and modify your project. It cannot escape the sandbox, phone home, or run indefinitely.
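
The first constraint, resolving paths against the root and refusing anything that lands outside it, can be sketched in a few lines. This is an illustration only, not the Smithers implementation: `resolveInSandbox` is a hypothetical helper, it handles POSIX-style paths as plain strings, and it never touches the file system, so the symlink check is out of scope here.

```ts
// Resolve `requested` against `rootDir` and reject paths outside the root.
// Illustrative only: string-based, POSIX-style, no symlink handling.
function resolveInSandbox(rootDir: string, requested: string): string {
  const rootParts = rootDir.split("/").filter((part) => part !== "");
  // An absolute request is taken as-is; a relative one starts at the root.
  const parts = requested.charAt(0) === "/" ? [] : [...rootParts];
  for (const part of requested.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") parts.pop();
    else parts.push(part);
  }
  // The resolved path must still begin with every segment of the root.
  const insideRoot = rootParts.every((part, index) => parts[index] === part);
  if (!insideRoot) throw new Error(`Path escapes sandbox: ${requested}`);
  return "/" + parts.join("/");
}

const allowed = resolveInSandbox("/work/project", "src/./index.ts");
// allowed is "/work/project/src/index.ts"
// resolveInSandbox("/work/project", "../secrets.env") throws
// resolveInSandbox("/work/project", "/etc/passwd") throws
```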

### Read-Only vs Full-Access Agents

Here is a question worth asking for every agent you create: does it actually need write access?

A reviewer does not need to modify files. A code generator does. Match the tools to the job:

```tsx
// Reviewer only needs to read — no write/edit access
const reviewer = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  tools: { read, grep },
});

// Implementer needs full access
const coder = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  tools: { read, write, edit, grep, bash },
});
```

Least privilege is not just a security principle. It is also a guardrail against expensive mistakes -- an agent that cannot write files cannot write bad files.

## Task Modes Without Agents

Not everything requires AI. Some tasks are deterministic. Some are just data. Smithers handles both without reaching for an agent.

### Compute Mode

When `children` is a function and there is no agent, the function runs directly at execution time:

```tsx
<Task id="validate" output={outputs.validation} timeoutMs={30000}>
  {async () => {
    // nothrow() stops a non-zero exit from throwing, so exitCode can be inspected
    const result = await $`bun test`.nothrow().quiet();
    return { passed: result.exitCode === 0, output: result.text() };
  }}
</Task>
```

Use compute mode for things that have a right answer: running tests, calling APIs, transforming data. No AI needed, no tokens burned.

### Static Mode

When `children` is a plain value, Smithers writes it directly as output. No computation, no agent -- just data:

```tsx
<Task id="config" output={outputs.config}>
  {{ environment: "production", version: "2.1.0" }}
</Task>
```

Use static mode for constants, values computed from upstream outputs, or seeding data into later tasks.

## Choosing the Right Approach

When you are staring at a new task, ask: does this require judgment?

| Scenario | Approach |
| --- | --- |
| Need AI reasoning or generation | Agent mode with `agent` prop |
| Need to run shell commands or tests | Compute mode with async callback |
| Need to pass data between steps | Static mode with literal value |
| Need AI + file access | Agent mode with tools |
| Need resilient AI calls | Agent with `retries` and/or `fallbackAgent` |
| Need diverse AI perspectives | Parallel tasks with different agents |

If the task requires judgment, use an agent. If it does not, use compute or static mode. If you are unsure, start without an agent; you can always add one later.
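
The three modes boil down to two checks on a task's props. The helper below is a hypothetical restatement of the rules in this section, not a Smithers export.

```ts
type TaskMode = "agent" | "compute" | "static";

// Hypothetical helper that mirrors the rules described in this section.
function taskMode(props: { agent?: unknown; children: unknown }): TaskMode {
  // An agent is present: the children become the prompt.
  if (props.agent !== undefined) return "agent";
  // No agent and a function child: the function runs at execution time.
  if (typeof props.children === "function") return "compute";
  // Anything else is written directly as output.
  return "static";
}

const modes = [
  taskMode({ agent: { name: "reviewer" }, children: "Review the code." }),
  taskMode({ children: async () => ({ passed: true }) }),
  taskMode({ children: { environment: "production" } }),
];
// modes is ["agent", "compute", "static"]
```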

## Next Steps

- [Built-in Tools](/integrations/tools) -- Full API reference for all five tools.
- [SDK Agents](/integrations/sdk-agents) -- Provider-backed agents and model configuration.
- [CLI Agents](/integrations/cli-agents) -- Using Claude Code, Codex, Gemini, and other CLI agents.
- [Structured Output](/guides/structured-output) -- How schema validation, retries, and repair prompts work.
- [Approvals](/concepts/approvals) -- Gate tool-using tasks behind human approval when needed.
- [Implement-Review Loop](/guides/review-loop) -- A production pattern using multi-agent review.

---

## Human-in-the-Loop

> How to pause workflows for human approval, handle denials, and design multi-step approval workflows.
> Source: https://smithers.sh/concepts/human-in-the-loop

You automate a workflow. It builds, it tests, it deploys. Then one Tuesday it deploys a broken migration to production at 3 AM because no human was awake to say "wait." Full automation is wonderful right up until it isn't.

The tension is real: you want machines to do the boring parts, but some decisions — deploying to production, publishing to customers, spending money — need a human brain in the loop. The question isn't whether to pause. It's how to pause *well*.

Smithers gives you two mechanisms, and the difference between them matters more than you might think.

## Simple Gates: `needsApproval`

Start with the simplest thing that could work. If you just need a human to say "go" before a task runs, put `needsApproval` on it:

```tsx
<Task id="deploy" output={outputs.deploy} agent={deployer} needsApproval>
  Deploy the application to production.
</Task>
```

The workflow pauses before executing the task. A human approves. The task runs. That's it — no decision value, no downstream branching. A gate, nothing more.

```bash
# Approve the gate
smithers approve abc123 --node deploy

# Resume the workflow
smithers up workflow.tsx --run-id abc123 --resume true
```

Use `needsApproval` when you need a checkpoint, not a choice. "Should we proceed?" is a gate. "What should we do next?" is not.

## Explicit Nodes: `<Approval>`

Here's where it gets interesting. What if a denial isn't an error — it's information? What if "no" means "take a different path" rather than "stop everything"?

That's the aha moment: **an approval can be data, not just a gate**.

When downstream tasks need to *read* the decision and branch on it, use [`<Approval>`](/components/approval):

```tsx
import { z } from "zod";
import { Approval, approvalDecisionSchema, createSmithers } from "smithers-orchestrator";

const { Workflow, smithers, outputs } = createSmithers({
  approval: approvalDecisionSchema,
  result: z.object({ status: z.enum(["published", "rejected"]) }),
});

export default smithers((ctx) => {
  const decision = ctx.outputMaybe(outputs.approval, { nodeId: "approve-publish" });

  return (
    <Workflow name="publish-flow">
      <Sequence>
        <Approval
          id="approve-publish"
          output={outputs.approval}
          request={{
            title: "Publish the draft?",
            summary: "Human review required before production publish.",
          }}
          onDeny="continue"
        />

        {decision ? (
          <Task id="record" output={outputs.result}>
            {{ status: decision.approved ? "published" : "rejected" }}
          </Task>
        ) : null}
      </Sequence>
    </Workflow>
  );
});
```

Look at the `onDeny="continue"` and the [branching logic](/concepts/control-flow) that follows. A denial doesn't kill the workflow — it flows through as a value. The downstream task reads `decision.approved` and acts accordingly.

The `<Approval>` node produces an `ApprovalDecision`:

```ts
type ApprovalDecision = {
  approved: boolean;
  note: string | null;
  decidedBy: string | null;
  decidedAt: string | null;
};
```

Four fields. The boolean is the verdict. The rest is audit trail. Every approval becomes a first-class piece of data in your workflow — queryable, branchable, persistent.

Smithers keeps the authoritative decision timestamp in its approval records and event history so durable outputs stay deterministic across [resume](/concepts/suspend-and-resume) and [replay](/concepts/time-travel).

## Denial Policies

So a human said "no." Now what? The `onDeny` prop answers that question, and the right answer depends entirely on what you're protecting.

### `onDeny="fail"` (default)

The workflow fails. Full stop. Use this for actions where denial means "this should not have been attempted":

```tsx
<Approval
  id="deploy-prod"
  output={outputs.deployApproval}
  request={{ title: "Deploy to production?" }}
  onDeny="fail"
/>
```

Production deploys, compliance-sensitive operations, anything destructive — if a human says no, you want the workflow to stop, not find a creative workaround.

### `onDeny="continue"`

The denial resolves as a decision value and the workflow keeps going. This is the policy that turns approvals into branching logic:

```tsx
<Approval
  id="approve-publish"
  output={outputs.approval}
  request={{ title: "Publish?" }}
  onDeny="continue"
/>

{decision?.approved ? (
  <Task id="publish" output={outputs.publish}>...</Task>
) : decision ? (
  <Task id="log-rejection" output={outputs.rejection}>
    {{ reason: decision.note ?? "No reason given" }}
  </Task>
) : null}
```

Notice the three-way branch: approved, denied, or not yet decided. The denied path does real work — logging the rejection, notifying someone, taking an alternative action. "No" is just another kind of data.
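
If the nested ternary is hard to read, the same three-way split can be written as a plain function. `nextStep` and `NextStep` are hypothetical names; the `ApprovalDecision` shape is the one shown earlier.

```ts
type ApprovalDecision = {
  approved: boolean;
  note: string | null;
  decidedBy: string | null;
  decidedAt: string | null;
};

type NextStep =
  | { kind: "wait" }
  | { kind: "publish" }
  | { kind: "log-rejection"; reason: string };

// undefined means no decision has been recorded yet.
function nextStep(decision: ApprovalDecision | undefined): NextStep {
  if (decision === undefined) return { kind: "wait" };
  if (decision.approved) return { kind: "publish" };
  return { kind: "log-rejection", reason: decision.note ?? "No reason given" };
}

const denied: ApprovalDecision = {
  approved: false,
  note: "Needs legal review",
  decidedBy: "user:editor",
  decidedAt: null,
};

const step = nextStep(denied);
// step is { kind: "log-rejection", reason: "Needs legal review" }
```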

### `onDeny="skip"`

The protected branch is skipped, but the rest of the workflow continues as if the approval node were never there:

```tsx
<Approval
  id="optional-review"
  output={outputs.reviewApproval}
  request={{ title: "Run additional review?" }}
  onDeny="skip"
/>
```

Think of this as "nice to have" approval. The extra review would be great, but the workflow doesn't depend on it.

### Choosing a Denial Policy

| Policy | Use when... |
| --- | --- |
| `"fail"` | Denial should stop the entire workflow (deploys, releases) |
| `"continue"` | You need to branch on the decision (publish vs reject) |
| `"skip"` | The approved work is optional and the workflow should continue without it |

If you aren't sure, start with `"fail"`. It's the safest default, and you can always loosen it later. You cannot un-deploy to production.
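
As a compact restatement of the table, here is a lookup from policy to what happens after a denial. The names `DenyPolicy`, `DenialOutcome`, and `resolveDenial` are hypothetical and only summarize the prose above.

```ts
type DenyPolicy = "fail" | "continue" | "skip";

type DenialOutcome =
  | "workflow-fails"
  | "continues-with-decision-value"
  | "continues-without-protected-work";

// What a denial leads to under each policy; "fail" is the default.
function resolveDenial(policy: DenyPolicy = "fail"): DenialOutcome {
  switch (policy) {
    case "fail":
      return "workflow-fails";
    case "continue":
      return "continues-with-decision-value";
    case "skip":
      return "continues-without-protected-work";
  }
}

const byDefault = resolveDenial(); // "workflow-fails"
```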

## The Approval Lifecycle

What actually happens when the workflow hits an approval node? Here's the full lifecycle, and every state is durable:

```
pending → requested → waiting-approval → approved | denied → completed
```

1. Smithers reaches the approval node
2. It persists an approval request record (title, summary, metadata)
3. The workflow [suspends in a durable waiting state](/concepts/suspend-and-resume)
4. A human approves or denies via CLI, API, or UI
5. The node resolves according to its denial policy
6. Downstream tasks become eligible to run
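
The lifecycle above can be read as a small state machine. The transition table below is a sketch derived from that diagram; `advance` and `transitions` are hypothetical names, not Smithers internals.

```ts
type ApprovalState =
  | "pending"
  | "requested"
  | "waiting-approval"
  | "approved"
  | "denied"
  | "completed";

// Allowed next states, taken from the lifecycle diagram.
const transitions: Record<ApprovalState, ApprovalState[]> = {
  pending: ["requested"],
  requested: ["waiting-approval"],
  "waiting-approval": ["approved", "denied"],
  approved: ["completed"],
  denied: ["completed"],
  completed: [],
};

function advance(current: ApprovalState, next: ApprovalState): ApprovalState {
  if (transitions[current].indexOf(next) === -1) {
    throw new Error(`Illegal transition: ${current} -> ${next}`);
  }
  return next;
}

let state: ApprovalState = "pending";
state = advance(state, "requested");
state = advance(state, "waiting-approval");
state = advance(state, "denied");
state = advance(state, "completed");
// advance("pending", "approved") would throw: a decision cannot precede the request.
```

Because each state is a plain string, it is the kind of value that can be written to storage and read back after a restart.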

"Durable" is the key word. If the process crashes while waiting for approval, the request still exists. When the process restarts and a decision arrives, the workflow picks up exactly where it left off. Your approval doesn't vanish just because a server rebooted — that would rather defeat the purpose.

## Multi-Step Approvals

Real pipelines often need more than one checkpoint. Each `<Approval>` is independent — they don't know about each other, and they don't need to:

```tsx
<Workflow name="release-pipeline">
  <Sequence>
    <Task id="build" output={outputs.build} agent={builder}>Build the release.</Task>

    <Approval
      id="qa-approval"
      output={outputs.qaApproval}
      request={{ title: "QA sign-off", summary: "All tests passed." }}
    />

    <Task id="stage" output={outputs.stage} agent={deployer}>Deploy to staging.</Task>

    <Approval
      id="prod-approval"
      output={outputs.prodApproval}
      request={{ title: "Production deploy", summary: "Staging looks good." }}
    />

    <Task id="deploy" output={outputs.deploy} agent={deployer}>Deploy to production.</Task>
  </Sequence>
</Workflow>
```

Build, wait for QA, stage, wait for production sign-off, deploy. Each gate pauses independently. The workflow advances step-by-step as humans approve:

```bash
# After build completes, approve QA
smithers approve rel-1 --node qa-approval --note "QA passed"
smithers up workflow.tsx --run-id rel-1 --resume true

# After staging, approve production
smithers approve rel-1 --node prod-approval --note "Ship it"
smithers up workflow.tsx --run-id rel-1 --resume true
```

This is a pattern you'll see in any organization with a release process. The workflow encodes the process; the humans provide judgment at the right moments.

## Approval with Context

A bare "Deploy to production?" is not very helpful when you're the one being asked at 11 PM. Give your approvers what they need to decide:

```tsx
<Approval
  id="deploy-approval"
  output={outputs.deployApproval}
  request={{
    title: `Deploy v${version} to production?`,
    summary: `${passedTests} tests passed. Risk level: ${riskLevel}.`,
    metadata: {
      commitSha: ctx.input.commitSha,
      changedFiles: analysis.filesChanged,
    },
  }}
/>
```

The `title` and `summary` are what the human sees. The `metadata` is persisted alongside the request and available in `smithers inspect`, so the approver can check the commit, review which files changed, and make an informed decision without switching tools.

Good approval context is the difference between "I guess?" and "Yes, ship it." Invest in it.

## `needsApproval` vs `<Approval>`: When to Use Each

| Feature | `needsApproval` | `<Approval>` |
| --- | --- | --- |
| Produces a decision value | No | Yes (`ApprovalDecision`) |
| Downstream branching | Not possible | Branch on `decision.approved` |
| Denial policies | Implicit fail | `"fail"`, `"continue"`, or `"skip"` |
| Custom request metadata | No | Yes (`title`, `summary`, `metadata`) |
| Visible in graph | As a flag on the task | As its own node |

**Rule of thumb:** If you're thinking of the approval as a padlock on a task, use `needsApproval`. If you're thinking of it as a fork in the road, use `<Approval>`.

## Structured Human Input: `<HumanTask>`

Sometimes you don't need a yes/no — you need the human to *provide data*. A code review, a triage classification, a budget estimate. [`<HumanTask>`](/components/human-task) is a task where the human is the agent: the workflow suspends until a human provides JSON matching the output schema, with validation and retries.

```tsx
<HumanTask
  id="human-review"
  output={outputs.review}
  prompt="Review the PR. Provide: approved (boolean), comments (string), severity (low/medium/high)."
  maxAttempts={5}
  timeoutMs={86_400_000}
/>
```

If the human provides invalid JSON, the task retries — up to `maxAttempts` (default 10). Schema validation happens at compute time, not at submission time. See the [HumanTask component reference](/components/human-task) for the full API.

## Conditional Gates: `<ApprovalGate>`

What if approval is only needed sometimes? Low-risk deploys sail through; high-risk deploys need a human. [`<ApprovalGate>`](/components/approval-gate) wraps [`<Branch>`](/concepts/control-flow) + [`<Approval>`](/components/approval) into a single component:

```tsx
<ApprovalGate
  id="deploy-approval"
  output={outputs.deployDecision}
  when={risk.level === "high"}
  request={{ title: "Approve high-risk deploy?" }}
  onDeny="fail"
/>
```

When `when` is `false`, the gate auto-approves immediately. When `true`, it pauses for human review. Both paths write a valid `ApprovalDecision` to the same output. See the [ApprovalGate component reference](/components/approval-gate).
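
In plain terms, the gate is a conditional around the human step. The sketch below assumes a particular shape for the automatic decision (a `note` of `"auto-approved"` and null audit fields); that shape, `gateDecision`, and `askHuman` are illustrative assumptions, so check the component reference for the real values.

```ts
type ApprovalDecision = {
  approved: boolean;
  note: string | null;
  decidedBy: string | null;
  decidedAt: string | null;
};

// Hypothetical model of a conditional gate.
function gateDecision(when: boolean, askHuman: () => ApprovalDecision): ApprovalDecision {
  if (!when) {
    // Assumed field values for an automatic approval; the real ones may differ.
    return { approved: true, note: "auto-approved", decidedBy: null, decidedAt: null };
  }
  return askHuman();
}

let humanAsked = 0;
const human = (): ApprovalDecision => {
  humanAsked += 1;
  return { approved: false, note: "Too risky on a Friday", decidedBy: "user:oncall", decidedAt: null };
};

const lowRisk = gateDecision(false, human); // approved without asking anyone
const highRisk = gateDecision(true, human); // the human's decision is returned as-is
```

Either way the caller receives the same type, which is what lets downstream tasks read the output without caring which path produced it.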

## Escalation: `<EscalationChain>`

When the question isn't "should a human decide?" but "which agent should handle this, and what happens when it can't?" — use [`<EscalationChain>`](/components/escalation-chain). It runs [agents](/concepts/agents-and-tools) in sequence, escalating to the next level when one fails or its `escalateIf` predicate returns `true`. Optionally, the final level is a human approval fallback:

```tsx
<EscalationChain
  id="support"
  escalationOutput={outputs.escalation}
  humanFallback
  levels={[
    { agent: fastAgent, output: outputs.tier1, escalateIf: (r) => r.confidence < 0.7 },
    { agent: powerAgent, output: outputs.tier2, escalateIf: (r) => r.confidence < 0.9 },
  ]}
>
  Resolve this ticket: {ctx.input.ticketBody}
</EscalationChain>
```

See the [EscalationChain component reference](/components/escalation-chain).
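
The escalation rule can be modeled as a loop over levels. This is a synchronous sketch with stub agents: `Level`, `escalate`, and the choice to throw when every level escalates without a human fallback are assumptions for illustration, not documented Smithers behavior.

```ts
interface Level<R> {
  name: string;
  run: (prompt: string) => R;
  escalateIf?: (result: R) => boolean;
}

// Try each level in order; escalate on failure or when the predicate says so.
function escalate<R>(
  levels: Level<R>[],
  prompt: string,
  humanFallback: boolean,
): { resolvedBy: string; result: R | null } {
  for (const level of levels) {
    try {
      const result = level.run(prompt);
      const shouldEscalate = level.escalateIf ? level.escalateIf(result) : false;
      if (!shouldEscalate) return { resolvedBy: level.name, result };
    } catch {
      // A level that throws escalates just like a low-confidence answer.
    }
  }
  if (humanFallback) return { resolvedBy: "human", result: null };
  throw new Error("All levels escalated and no human fallback is configured");
}

type Answer = { text: string; confidence: number };

const tiers: Level<Answer>[] = [
  { name: "fast", run: () => ({ text: "Try restarting.", confidence: 0.5 }), escalateIf: (r) => r.confidence < 0.7 },
  { name: "power", run: () => ({ text: "Rotate the API key.", confidence: 0.95 }), escalateIf: (r) => r.confidence < 0.9 },
];

const resolution = escalate(tiers, "Resolve this ticket", true);
// resolution.resolvedBy is "power"
```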

## Selection and Ranking

Not every human decision is yes/no. [`<Approval>`](/components/approval) supports `mode="select"` and `mode="rank"` for choosing among options:

```tsx
<Approval
  id="pick-plan"
  mode="select"
  output={outputs.selection}
  request={{ title: "Pick a rollout plan" }}
  options={[
    { key: "canary", label: "Canary" },
    { key: "regional", label: "Regional" },
  ]}
/>
```

`mode="select"` returns `{ selected: string, notes: string | null }`. `mode="rank"` returns `{ ranked: string[], notes: string | null }`. See the [Approval component reference](/components/approval) for the schemas.

## Scoped Approvals and Auto-Approval

For sensitive approvals, restrict who can decide with `allowedScopes` and `allowedUsers`. For approvals that become routine, use `autoApprove` to skip the human after enough consecutive manual approvals:

```tsx
<Approval
  id="deploy-prod"
  output={outputs.deployApproval}
  request={{ title: "Deploy to production?" }}
  allowedUsers={["user:oncall", "user:release-manager"]}
  autoApprove={{ after: 5, audit: true }}
/>
```

`autoApprove` also supports `condition` (auto-approve when a predicate is true) and `revertOn` (revert to human approval when conditions change). See the [Approval component reference](/components/approval) for the full `ApprovalAutoApprove` type.
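
The `after` threshold can be pictured as a streak counter. The sketch below is one plausible reading, not Smithers' actual bookkeeping: `shouldAutoApprove` and `PastDecision` are hypothetical names, and the rule that a denial resets the streak while automatic approvals leave it unchanged is an assumption.

```typescript
// One entry per past decision on this approval, oldest first.
type PastDecision = { approved: boolean; manual: boolean };

// Hypothetical helper: has the human approved `after` times since the last denial?
function shouldAutoApprove(history: PastDecision[], after: number): boolean {
  let streak = 0;
  // Walk backwards from the most recent decision.
  for (let i = history.length - 1; i >= 0; i--) {
    const decision = history[i];
    if (decision === undefined || !decision.approved) break; // a denial ends the streak
    if (decision.manual) streak++; // automatic approvals neither add to nor reset it
  }
  return streak >= after;
}

const manualApproval: PastDecision = { approved: true, manual: true };
const history: PastDecision[] = [
  manualApproval,
  manualApproval,
  manualApproval,
  manualApproval,
  manualApproval,
];

const ready = shouldAutoApprove(history, 5);
```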

## Next Steps

- [Approvals](/concepts/approvals) — Approval nodes, denial policies, and decision values as workflow data.
- [Approval Component](/components/approval) — Full API reference for `<Approval>`.
- [ApprovalGate Component](/components/approval-gate) — Conditional approval gates.
- [HumanTask Component](/components/human-task) — Structured human input with schema validation.
- [EscalationChain Component](/components/escalation-chain) — Multi-level agent escalation with human fallback.
- [Suspend and Resume](/concepts/suspend-and-resume) — The underlying durability model.
- [Workflow Approval Example](/examples/workflow-approval) — An end-to-end approval flow.

---

## Suspend and Resume

> How Smithers workflows pause, persist state, and resume execution after crashes, approvals, and interruptions.
> Source: https://smithers.sh/concepts/suspend-and-resume

You have a workflow with twelve steps. Steps one through six took forty minutes and burned real money on agent calls. Step seven crashes. You re-run the workflow. It starts at step one.

That is the problem. Suspend and resume is the solution.

## The Durability Contract

The rule fits in one sentence:

> A completed task is never re-executed. When a workflow resumes, it picks up from the first incomplete task.

Think of it like a save game. Every time a task finishes, Smithers writes the result to disk. If the power goes out, you do not replay the entire game from the title screen. You reload your last save and keep going.

So that forty-minute, twelve-step workflow that crashed at step seven? You resume from step seven. Steps one through six are done. Their outputs are already in [SQLite](https://sqlite.org/wal.html). You do not pay for them again.

## How State Is Preserved

Every task output is written to SQLite immediately on completion, keyed by `(runId, nodeId, iteration)`. When you resume a run, Smithers does five things:

1. **Loads existing [state](/concepts/workflow-state)** -- Reads run metadata, node states, and attempt history from SQLite
2. **Validates the environment** -- Checks that the workflow file hash and VCS revision match the original run
3. **Cleans up stale work** -- Cancels any in-progress attempts older than 15 minutes
4. **Re-renders** -- Builds the JSX tree with persisted outputs already in context
5. **Continues** -- Schedules and executes remaining incomplete tasks

No magic. The database is the source of truth, and the resume logic walks it forward.

## Three Ways Workflows Pause

There are exactly three reasons a workflow stops before it finishes: something broke, someone needs to decide, or you told it to stop.

### 1. Crash Recovery

The process dies. Maybe the machine ran out of memory. Maybe you hit Ctrl-C at the wrong moment. Either way, some tasks are stuck in `in-progress` with no process behind them.

On resume, Smithers handles this automatically:

```bash
# Start a run — crashes midway through "implement"
smithers up workflow.tsx --run-id run-1 --input '{"repo": "/my-project"}'

# "analyze" finished, "implement" was in-progress, "report" was pending
# Resume picks up from "implement"
smithers up workflow.tsx --run-id run-1 --resume true
```

"But what about that stuck in-progress task?" Good question. In-progress attempts older than 15 minutes are marked cancelled and retried. This is a deliberate tradeoff: it prevents zombie tasks from blocking the workflow forever, while still giving legitimately long-running tasks room to finish.

### 2. Approval Gates

Some steps should not proceed without a human saying yes. When a workflow reaches an [`<Approval>`](/components/approval) node or a `<Task needsApproval>`, it pauses durably until someone decides:

```tsx
<Approval
  id="deploy-approval"
  output={outputs.approval}
  request={{
    title: "Deploy to production?",
    summary: "All checks passed. Ready to ship.",
  }}
  onDeny="fail"
/>
```

The workflow enters `waiting-approval` status. Nothing runs. Nothing times out. It waits as long as it needs to. Resolve it from the CLI:

```bash
smithers approve run-1 --node deploy-approval --note "Ship it"
smithers up workflow.tsx --run-id run-1 --resume true
```

Or deny it:

```bash
smithers deny run-1 --node deploy-approval --note "Blocked by QA"
```

See [Human-in-the-Loop](/concepts/human-in-the-loop) or [Approvals](/concepts/approvals) for the full pattern.

### 3. Manual Cancellation

Sometimes you want to stop a workflow on purpose -- maybe you realized the input was wrong, or you need the machine for something else. Cancel now, resume later:

```bash
# Cancel a running workflow
smithers cancel run-1

# Later, resume from where it stopped
smithers up workflow.tsx --run-id run-1 --resume true
```

The workflow picks up where it left off, as if nothing happened.

## What Gets Skipped on Resume

This table is worth memorizing, or at least bookmarking:

| Node state before resume | Behavior on resume |
| --- | --- |
| `finished` | Skipped. Output exists and is valid. |
| `skipped` | Remains skipped. |
| `failed` (retries exhausted) | Stays failed unless workflow code now allows more retries. |
| `in-progress` (stale, >15 min) | Cancelled, then retried as `pending`. |
| `in-progress` (recent) | Left in-progress. If it never completes, a later resume cancels it once it is older than 15 minutes. |
| `pending` | Scheduled for execution. |
| `waiting-approval` | Stays waiting. Approve or deny to unblock. |
| `cancelled` | Stays cancelled. |

The logic is straightforward: finished work stays finished, pending work gets scheduled, and anything stuck in limbo gets cleaned up. No surprises.
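
The table reads naturally as a function from node state to resume action. This sketch only restates the table; `onResume` and the action names are hypothetical, and measuring an attempt's age in milliseconds is an assumption about how the 15-minute cutoff is applied.

```typescript
type NodeState =
  | "finished"
  | "skipped"
  | "failed"
  | "in-progress"
  | "pending"
  | "waiting-approval"
  | "cancelled";

type ResumeAction = "skip" | "leave" | "cancel-and-retry" | "schedule";

const STALE_AFTER_MS = 15 * 60 * 1000; // the 15-minute cutoff from the table

// Hypothetical helper mirroring the table above, row by row.
function onResume(state: NodeState, attemptAgeMs = 0): ResumeAction {
  switch (state) {
    case "finished":
      return "skip"; // output already exists
    case "skipped":
    case "failed":
    case "cancelled":
    case "waiting-approval":
      return "leave"; // state is kept as-is
    case "in-progress":
      return attemptAgeMs > STALE_AFTER_MS ? "cancel-and-retry" : "leave";
    case "pending":
      return "schedule";
  }
}

const staleAction = onResume("in-progress", 20 * 60 * 1000);
const recentAction = onResume("in-progress", 60 * 1000);
```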

## Resuming Programmatically

The CLI is fine for manual recovery. For automation, use the [`runWorkflow`](/runtime/run-workflow) API directly:

```ts
import { runWorkflow } from "smithers-orchestrator";
import workflow from "./workflow";

// Initial run
const result1 = await runWorkflow(workflow, {
  runId: "my-run",
  input: { repo: "/my-project" },
});

// result1.status might be "failed" or "waiting-approval"

// Resume later
const result2 = await runWorkflow(workflow, {
  runId: "my-run",
  resume: true,
});
// result2 picks up from where result1 left off
```

Same contract, different interface. The `runId` is the thread that ties the two calls together.

## Stable Task IDs

Here is where most people trip up.

Resumability depends on **stable, deterministic task identity**. The `id` prop on each `<Task>` becomes the durable key in SQLite. If the key changes between runs, Smithers cannot find the old output. It treats the task as new and runs it from scratch.

```tsx
// Good — stable, descriptive IDs
<Task id="analyze" output={outputs.analysis}>...</Task>
<Task id={`${ticket.id}:implement`} output={outputs.implement}>...</Task>

// Bad — IDs that change between renders
<Task id={`task-${Math.random()}`} output={outputs.analysis}>...</Task>
<Task id={`task-${index}`} output={outputs.analysis}>...</Task>
```

Why is `task-${index}` bad? Because if you insert a new item at the beginning of a list, every index shifts. The item that was task 3 now renders as task 4, so Smithers hands it the output that was persisted for a different item. This is the same problem React has with list keys, and the fix is the same: derive keys from the data, not the position.

**Rules for stable IDs:**
- Use fixed strings for static tasks: `id="analyze"`, `id="report"`
- Derive IDs from stable data for dynamic tasks: `id={`${ticket.id}:implement`}`
- Never use array indices, timestamps, or random values
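
You can watch the index problem happen without Smithers at all. The sketch below uses a plain `Map` as a stand-in for the persisted outputs; the ticket data and helper names are made up for illustration.

```typescript
type Ticket = { id: string; title: string };

const indexId = (index: number): string => `task-${index}`;
const dataId = (ticket: Ticket): string => `${ticket.id}:implement`;

const firstRun: Ticket[] = [
  { id: "T-1", title: "Fix login" },
  { id: "T-2", title: "Add export" },
];

// First run: persist one output per task id, under both naming schemes.
const savedByIndex = new Map<string, string>();
const savedByData = new Map<string, string>();
firstRun.forEach((ticket, index) => {
  savedByIndex.set(indexId(index), `output for ${ticket.id}`);
  savedByData.set(dataId(ticket), `output for ${ticket.id}`);
});

// Before the next run, a new ticket is inserted at the front of the list.
const secondRun: Ticket[] = [{ id: "T-0", title: "Hotfix" }, ...firstRun];

// Look up what each scheme would load for every ticket.
const lookupByIndex = secondRun.map((ticket, index) => ({
  ticket: ticket.id,
  found: savedByIndex.get(indexId(index)),
}));
const lookupByData = secondRun.map((ticket) => ({
  ticket: ticket.id,
  found: savedByData.get(dataId(ticket)),
}));
```

With index-based IDs, the new ticket `T-0` inherits `T-1`'s output and `T-2` finds nothing. With data-derived IDs, `T-0` is correctly treated as new and the other two find their own outputs.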

## Loop State Persistence

Loops are where durability really earns its keep. If a workflow crashes mid-loop, you do not want to replay every completed iteration.

And you do not have to:

- Completed iterations are preserved (each has its own output row)
- The loop resumes from the incomplete iteration
- `ctx.latest()` correctly returns the most recent completed output

```tsx
<Loop until={approved} maxIterations={5}>
  <Sequence>
    <Task id="implement" output={outputs.implement} agent={coder}>...</Task>
    <Task id="review" output={outputs.review} agent={reviewer}>...</Task>
  </Sequence>
</Loop>
```

If the process crashes after iteration 2's `implement` but before `review`, resuming picks up at iteration 2's `review` task. Iterations 0 and 1 are untouched. Their outputs sit in SQLite, ready for anything that needs them.
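
Finding the resume point is a matter of walking iterations in order and stopping at the first task without an output row. A minimal sketch, assuming rows are identified by `(nodeId, iteration)`; `nextLoopTask` is a hypothetical helper and it ignores the loop's `until` condition for simplicity.

```typescript
type RowKey = { nodeId: string; iteration: number };

// Hypothetical helper: the first (task, iteration) pair with no persisted output.
function nextLoopTask(
  taskOrder: string[],
  completed: RowKey[],
  maxIterations: number,
): RowKey | null {
  const done = new Set(completed.map((row) => `${row.nodeId}#${row.iteration}`));
  for (let iteration = 0; iteration < maxIterations; iteration++) {
    for (const nodeId of taskOrder) {
      if (!done.has(`${nodeId}#${iteration}`)) return { nodeId, iteration };
    }
  }
  return null; // every iteration up to the limit is complete
}

// The crash described above: iteration 2 finished "implement" but not "review".
const completedRows: RowKey[] = [
  { nodeId: "implement", iteration: 0 },
  { nodeId: "review", iteration: 0 },
  { nodeId: "implement", iteration: 1 },
  { nodeId: "review", iteration: 1 },
  { nodeId: "implement", iteration: 2 },
];

const next = nextLoopTask(["implement", "review"], completedRows, 5);
```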

## Environment Validation

"What if I fix a bug in my workflow and then resume?" Smithers will not let you.

On resume, Smithers checks that:

- The workflow file hash matches the original run
- The VCS revision matches (if tracked)

If either changed, resume is rejected. This is intentional. Resuming a run with a different workflow definition could produce inconsistent state -- imagine step eight reading outputs from steps one through seven that were produced by different code.
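
The check itself is a comparison of two fingerprints. The sketch below assumes the fingerprint is a workflow hash plus an optional VCS revision; `Fingerprint` and `resumeBlockers` are hypothetical names, and how Smithers computes the hash is not shown.

```typescript
type Fingerprint = { workflowHash: string; vcsRevision?: string };

// Hypothetical helper: list the reasons a resume would be rejected.
function resumeBlockers(recorded: Fingerprint, current: Fingerprint): string[] {
  const blockers: string[] = [];
  if (recorded.workflowHash !== current.workflowHash) {
    blockers.push("workflow file hash changed");
  }
  // Only compare revisions when the original run tracked one.
  if (recorded.vcsRevision !== undefined && recorded.vcsRevision !== current.vcsRevision) {
    blockers.push("VCS revision changed");
  }
  return blockers;
}

const original: Fingerprint = { workflowHash: "abc123", vcsRevision: "rev-1" };
const unchanged = resumeBlockers(original, { workflowHash: "abc123", vcsRevision: "rev-1" });
const edited = resumeBlockers(original, { workflowHash: "def456", vcsRevision: "rev-1" });
```

An empty list means the resume may proceed; anything else means start a fresh run.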

To fix a bug and retry, start a fresh run:

```bash
# Fix the workflow code, then start new
smithers up workflow.tsx --input '{"repo": "/my-project"}'
```

It costs you the re-execution, but it guarantees consistency. That is a trade worth making.

## Next Steps

- [Approvals](/concepts/approvals) -- Explicit approval nodes and denial policies.
- [Human-in-the-Loop](/concepts/human-in-the-loop) -- Approval gates, denial policies, and multi-step approval patterns.
- [Workflow State](/concepts/workflow-state) -- The durable state model resume loads back into memory.
- [Resumability Guide](/guides/resumability) -- Practical tips for designing resumable workflows.
- [Execution Model](/concepts/execution-model) -- The internal execution loop that drives suspend and resume.

---

## Approvals

> Model human review as an explicit durable gate in the workflow graph.
> Source: https://smithers.sh/concepts/approvals

What happens when your workflow reaches a point where a human needs to say "yes" or "no"?

You could bolt a `needsApproval: true` flag onto a task and let the scheduler figure it out. But think about what that actually means. The approval isn't a property of the task -- it's a *separate decision* with its own lifecycle, its own persistence requirements, and its own downstream consequences. Treating it as a boolean flag hides all of that.

Approvals should be explicit workflow nodes, not a boolean flag on an otherwise normal step.

From first principles, an approval is:

- a durable request for a human decision
- a suspended execution state
- an audited decision record
- a dependency that downstream work can wait on

That makes approvals a workflow primitive in their own right.

## Approval as a Node

Instead of:

```ts
// Old shape
needsApproval: true
```

approvals are explicit [`<Approval>`](/components/approval) workflow nodes.

**In JSX:**

```tsx
import {
  Approval,
  Sequence,
  Task,
  Workflow,
  approvalDecisionSchema,
  createSmithers,
} from "smithers-orchestrator";
import { z } from "zod";

const { smithers, outputs } = createSmithers({
  approval: approvalDecisionSchema,
  published: z.object({ status: z.string() }),
});

export default smithers((ctx) => {
  const approval = ctx.outputMaybe(outputs.approval, { nodeId: "approve-deploy" });

  return (
    <Workflow name="deploy">
      <Sequence>
        <Approval
          id="approve-deploy"
          output={outputs.approval}
          request={{
            title: "Deploy to production?",
            summary: "Human review required before release.",
          }}
          onDeny="continue"
        />

        {approval ? (
          <Task id="record" output={outputs.published}>
            {{ status: approval.approved ? "approved" : "rejected" }}
          </Task>
        ) : null}
      </Sequence>
    </Workflow>
  );
});
```

Look at what you gain. The `Approval` node is right there in the graph, visible to anyone reading the workflow. It has an id, an output type, a request, and an explicit denial policy. Nothing is hidden.

This is better for three reasons:

1. the approval is visible in the graph
2. the execution can [suspend on a first-class durable node](/concepts/suspend-and-resume)
3. the decision can be reused or branched on explicitly

## What an Approval Produces

An approval gate should resolve to a typed decision object:

```ts
type ApprovalDecision = {
  readonly approved: boolean;
  readonly note: string | null;
  readonly decidedBy: string | null;
  readonly decidedAt: string | null;
};
```

Not just a boolean. A full record: who decided, when, and why. Downstream nodes can depend on the decision as data, not only as scheduler side effects.

"Why does `decidedBy` matter in the type?" Because six months from now, when someone asks who approved the deploy that broke prod, you want the answer in the workflow output -- not buried in a Slack thread.

## Lifecycle

An approval node should move through a durable lifecycle:

```txt
pending
  -> requested
  -> waiting-approval
  -> approved | denied
  -> completed | failed | routed
```

More concretely:

1. Smithers reaches the approval node.
2. It persists an approval request record.
3. The execution suspends in a durable waiting state.
4. A human approves or denies the request.
5. The node resolves according to its policy.
6. Downstream nodes become eligible to run.

Step 3 is where the magic -- or rather, the engineering -- happens. The process can crash, restart, sleep for a week. When it comes back, the approval request is still there in durable storage, and the execution picks up where it left off.

If the process restarts while waiting, the approval request still exists and the execution can resume later.
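
The lifecycle diagram can be written down as a transition table. This is a sketch of the diagram as drawn, not of Smithers internals: `transitions` and `advance` are hypothetical names, and the table allows any of the three final states after either decision because the diagram does not distinguish them.

```typescript
type ApprovalState =
  | "pending"
  | "requested"
  | "waiting-approval"
  | "approved"
  | "denied"
  | "completed"
  | "failed"
  | "routed";

// Allowed next states, read directly off the lifecycle diagram.
const transitions: Record<ApprovalState, readonly ApprovalState[]> = {
  pending: ["requested"],
  requested: ["waiting-approval"],
  "waiting-approval": ["approved", "denied"],
  approved: ["completed", "failed", "routed"],
  denied: ["completed", "failed", "routed"],
  completed: [],
  failed: [],
  routed: [],
};

// Hypothetical helper: move to the next state or reject an illegal jump.
function advance(from: ApprovalState, to: ApprovalState): ApprovalState {
  if (!transitions[from].includes(to)) {
    throw new Error(`illegal transition ${from} -> ${to}`);
  }
  return to;
}

let state: ApprovalState = "pending";
for (const step of ["requested", "waiting-approval", "approved", "completed"] as const) {
  state = advance(state, step);
}
```

Because each state is a plain value, persisting the current one is enough for a restarted process to know where the approval stands.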

## Request Shape

The `request` function should be pure and derived from already-computed workflow data:

```ts
request: ({ build }) => ({
  title: `Deploy ${build.version}?`,
  summary: build.plan,
  metadata: {
    risk: build.risk,
    commitSha: build.commitSha,
  },
})
```

That request is what Smithers persists, displays in UIs, and exposes through CLI or API tooling.

Notice: the request doesn't reach out to external systems or compute new data. It takes what the workflow already knows and shapes it into something a human can act on. Pure function of upstream outputs. That's what makes it safe to persist and [replay](/concepts/time-travel).

## Denial Policies

What should happen when someone says "no"? That depends entirely on context. A compliance gate should halt the workflow. A review gate might just record the rejection and move on.

An approval gate should make denial behavior explicit.

### `onDeny: "fail"`

The workflow fails when the gate is denied.

This is appropriate for destructive or compliance-sensitive actions.

### `onDeny: "continue"`

The gate resolves to a denial decision and the workflow continues. Downstream logic can [branch](/concepts/control-flow) on that value.

### `onDeny: "skip"`

The protected branch is skipped, but the rest of the workflow continues.

The important part is that denial handling is declared in the workflow, not buried inside scheduler heuristics. You read the workflow definition and know exactly what "denied" means for each gate.
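
Side by side, the three policies look like this. The sketch restates the descriptions above as data; `onDenied` and `DenialEffect` are hypothetical names, not part of the Smithers API.

```typescript
type OnDeny = "fail" | "continue" | "skip";

type DenialEffect =
  | { workflow: "fails" }
  | { workflow: "continues"; protectedBranch: "branches-on-decision" | "skipped" };

// Hypothetical helper: what a denial means under each declared policy.
function onDenied(policy: OnDeny): DenialEffect {
  switch (policy) {
    case "fail":
      return { workflow: "fails" }; // the whole run stops
    case "continue":
      // The denial is recorded as a value; downstream logic decides what to do.
      return { workflow: "continues", protectedBranch: "branches-on-decision" };
    case "skip":
      // The gated work is skipped, everything else proceeds.
      return { workflow: "continues", protectedBranch: "skipped" };
  }
}

const compliance = onDenied("fail");
const review = onDenied("continue");
const optionalStep = onDenied("skip");
```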

## Branching on Approval

Because approvals are values, not only control flags, you can route explicitly:

```tsx
const approval = ctx.outputMaybe(outputs.approval, { nodeId: "approve-release" });

return (
  <Workflow name="publish">
    <Approval
      id="approve-release"
      output={outputs.approval}
      request={{
        title: `Publish ${report.title}?`,
        summary: report.body,
      }}
      onDeny="continue"
    />

    {approval ? (
      <Branch
        if={approval.approved}
        then={<Task id="publish" output={outputs.published}>{{ status: "published" }}</Task>}
        else={<Task id="record-rejection" output={outputs.rejected}>{{ status: "rejected" }}</Task>}
      />
    ) : null}
  </Workflow>
);
```

This keeps the approval decision explicit in the rendered graph instead of hiding it inside scheduler-only state.

Read the `Branch`. If approved, publish. If denied, record the rejection. Both paths are visible in the workflow definition, both produce typed outputs, both are part of the graph. No hidden conditional logic in the scheduler.

## Storage Model

Approval state belongs to Smithers-managed [workflow state](/concepts/workflow-state), not to domain models.

Smithers should persist at least:

- approval node id
- execution id
- current status
- request payload
- decision payload
- timestamps
- actor identity

That gives you:

- resumability
- auditability
- UI/API query support
- explicit graph semantics

This is not incidental bookkeeping. It's the foundation for answering "who approved what, when, and why" -- the question every production system eventually needs to answer.

## Approvals and Effect Primitives

Approvals map naturally onto durable Effect concepts:

- a request record is persisted metadata
- the waiting state is a durable suspension point
- the decision behaves like a durable deferred value

Smithers should compile approval nodes onto durable primitives rather than inventing bespoke in-memory waiting logic.

Why does this matter? Because in-memory waiting dies when the process dies. Durable primitives survive restarts by design. An approval that might wait hours or days *must* be durable. The abstraction choice isn't academic -- it determines whether your approvals actually work in production.

## Notifications and Automation

An approval sitting in durable storage is useless if nobody knows about it.

Approval creation should emit a durable event so other systems can react:

- send Slack messages
- open a review UI
- create a Linear issue
- notify on-call engineers

The important boundary is:

- Smithers records the approval request durably
- external systems subscribe and notify
- the human decision flows back into the same durable gate

Smithers owns the state. External systems own the notification. Neither crosses into the other's territory.

## CLI and API Shape

The control plane should target approval nodes directly.

For example:

```bash
smithers approve <id> --node approve-deploy --note "Ship it"
smithers deny <id> --node approve-deploy --note "Blocked by QA"
```

The exact transport can vary, but the key should be `(runId, nodeId)`, not a hidden internal row id.

Simple, direct, auditable. The run ID tells you which execution. The node ID tells you which gate. The note tells you why. That's everything you need.

## Why Explicit Gates Matter

An explicit approval node is easier to reason about than a property on a task because it makes the workflow honest.

You can see:

- where human intervention is required
- what exactly is being approved
- how denial changes the graph
- what downstream work depends on the decision

That is the right abstraction for a durable workflow system.

When you look at a workflow graph and see an `Approval` node, you know immediately: the workflow pauses here for a human. You know what data the human sees. You know what happens on "yes" and what happens on "no." There's nothing to guess.

## Next Steps

- [Human-in-the-Loop](/concepts/human-in-the-loop) -- Patterns for gates, denials, and structured human input.
- [Approval Component](/components/approval) -- Full API reference for approval nodes.
- [Suspend and Resume](/concepts/suspend-and-resume) -- The durability model underneath approval waits.
- [Execution Model](/concepts/execution-model) -- Understand how suspended approval nodes fit into durable execution.
- [Workflow Approval Example](/examples/workflow-approval) -- See an end-to-end approval flow in practice.

---

## Caching

> Reuse step outputs when the same workflow inputs and dependencies appear again.
> Source: https://smithers.sh/concepts/caching

Why does your workflow call the same agent twice with the same inputs?

Think about it. If a step takes the same ticket ID, the same description, and the same model version -- and the model is deterministic enough for your purposes -- the second call is pure waste. You already have the answer. You paid for it once.

Caching in Smithers lets you say: "I've seen this before. Use the old result."

But here's the catch. Caching is only safe when you know *exactly* what "this" means. That's why Smithers caching is:

- **per-step** -- not a global switch on the whole workflow
- **explicit** -- you declare what matters in the cache key
- **derived from declared inputs and dependencies** -- no hidden state
- **validated against the current output model before reuse** -- stale shapes are rejected

It should not depend on hidden renderer state or ad hoc prompt hashing.

## Step-Level Caching

Caching belongs on the step that produces reusable work.

Why per-step and not per-workflow? Because different steps have different purity profiles. A summarization step might be safely cacheable. A deployment step never is. Putting the cache declaration on the step forces you to make that judgment where it matters.

For example:

```ts
cache: {
  by: ({ input }) => ({
    ticketId: input.ticketId,
    description: input.description,
  }),
  version: "analysis-v1",
}
```

This says:

- cache the step that carries this declaration (an `analyze` step in this example)
- key it from the declared `by` function
- invalidate old entries if the algorithm or provider changes by bumping `version`

That is a better fit than a global workflow-wide cache switch.

## Default Mental Model

Here is the one sentence you should internalize:

> A step is cacheable when it behaves like a pure function of its declared inputs.

If a step depends on:

- `input`
- `needs`
- stable service behavior

then Smithers can safely reuse a previous successful output for the same key.

If a step has hidden side effects or reads mutable external state, it should not be cached by default.

"But wait," you might think, "LLM calls aren't truly pure functions." Right. They're not. But for many use cases -- summarization, classification, structured extraction -- the outputs are stable *enough* that re-running them is waste, not value. You're caching the *work*, not asserting mathematical purity.

## What Goes Into the Cache Key

The default cache key should be derived from durable workflow structure plus the explicit `cache.by` payload.

Good key components:

- workflow name
- step id
- output model identity
- `cache.version`
- serialized value returned by `cache.by`

For example:

```ts
cache: {
  by: ({ input, analysis }) => ({
    repo: input.repo,
    summary: analysis.summary,
  }),
  version: "report-v2",
}
```

Notice what's *not* in there: service instances, runtime objects, opaque graphs. Those aren't serializable, aren't stable, and aren't yours to hash. You pick the data that determines the output. Nothing more.

This is better than attempting to hash arbitrary service graphs or opaque runtime objects.
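
Here is one way such a key could be assembled, as a sketch rather than the actual Smithers algorithm. `stableStringify`, `cacheKey`, and `CacheKeyParts` are hypothetical names; the only real requirement illustrated is that serialization must be stable, so that two `by` payloads with the same data produce the same key regardless of property order.

```typescript
// Serialize JSON-compatible data with object keys sorted, so key order never matters.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const body = Object.keys(record)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(record[key])}`)
      .join(",");
    return `{${body}}`;
  }
  return JSON.stringify(value) ?? "null"; // undefined has no JSON form
}

type CacheKeyParts = {
  workflowName: string;
  stepId: string;
  outputModel: string; // some stable identifier for the output schema (assumed)
  version: string; // cache.version
  by: unknown; // value returned by cache.by
};

function cacheKey(parts: CacheKeyParts): string {
  return stableStringify([
    parts.workflowName,
    parts.stepId,
    parts.outputModel,
    parts.version,
    parts.by,
  ]);
}

const base = { workflowName: "triage", stepId: "report", outputModel: "report", version: "report-v2" };
const keyA = cacheKey({ ...base, by: { repo: "/my-project", summary: "Tokens expire" } });
const keyB = cacheKey({ ...base, by: { summary: "Tokens expire", repo: "/my-project" } });
const keyBumped = cacheKey({ ...base, version: "report-v3", by: { repo: "/my-project", summary: "Tokens expire" } });
```

`keyA` and `keyB` are identical because only the data matters; `keyBumped` differs because the version changed.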

## Why Explicit Keys Matter

Imagine debugging a production workflow where a step returned a stale result. With magic cache keys, you'd be spelunking through framework internals trying to figure out what the cache thought was "the same." With explicit keys, you open the step definition and read it.

Explicit keys make cache behavior reviewable.

You can answer:

- what exact data invalidates this step?
- did the provider/model change?
- does this step depend on hidden filesystem or network state?
- should this cache survive a workflow refactor?

Magic cache keys tend to break trust because users cannot predict when a step will reuse old work.

## Cache Validation

Here's a subtle problem. You cached a step's output last week. Since then, you changed the output schema -- added a field, tightened a type. The cached bytes still exist, keyed to the same inputs. Should Smithers blindly hand you the old shape?

No.

When Smithers finds a cache hit, it should still validate the cached payload against the current output model before reusing it.

That protects against:

- model shape changes
- decoding changes
- old invalid cache entries

If validation fails, Smithers should treat the entry as a miss and compute a fresh value.

This is the safety net that makes caching practical in a system where output schemas evolve.

## Example

```tsx
<Task
  id="summarize"
  output={outputs.summary}
  agent={summarizer}
  deps={{ analysis: outputs.analysis }}
  cache={{
    by: ({ analysis }) => ({
      summary: analysis.summary,
      severity: analysis.severity,
    }),
    version: "summary-v1",
  }}
>
  {({ analysis }) => `Write a current-status summary from this analysis:\n\n${analysis.summary}`}
</Task>
```

If the same analysis appears again with the same cache version, Smithers can reuse the persisted `summary` row instead of calling the agent again.

Read the `by` function. It tells you everything: this step's output depends on the analysis summary and the severity. Change either one, and you get a fresh call. Change neither, and you get the cached result. No guessing.

## What Should Not Be Cached

Ask yourself: "If I replayed the cached output instead of running this step, would anything be wrong?"

For a summarization step, no -- you'd get the same summary. For a deployment step, absolutely yes -- you'd skip the actual deploy.

Caching is a bad fit for steps that are primarily about side effects.

Examples:

- deploy to production
- send email
- mutate a Git branch
- call an external system whose current state matters
- open an approval request

Those should either disable caching entirely or use explicit idempotency semantics separate from normal output caching.

## Caching and Effect Services

Effect services are not part of the cache key automatically.

That is intentional. Service instances are often not serializable or stable.

If service behavior affects the output, encode that in the cache version:

```ts
cache: {
  by: ({ input }) => ({ prompt: input.prompt }),
  version: "anthropic-sonnet-4-2025-02",
}
```

When you switch from Sonnet to Opus, bump the version string. The old cache entries become misses. This keeps invalidation under user control.

You might wish Smithers could detect the model change automatically. But "which service details matter" is a judgment call only you can make. A logging service change doesn't invalidate outputs. A model provider change does. Smithers can't know that -- so it asks you to say it.

## Runtime Behavior

With caching enabled on a step, Smithers should:

1. compute the cache key before executing the step
2. look for a previously successful cached output
3. validate the cached payload against the output model
4. if valid, mark the step as completed from cache
5. otherwise run the step normally and persist the fresh output

From the rest of the workflow's perspective, a cache hit behaves the same as a completed upstream step.

That last point matters. Downstream steps don't know whether their dependency was computed fresh or pulled from cache. They see a completed step with a valid output. The abstraction is clean.
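
The five steps fit in a small function. This is a sketch of the described behavior with an in-memory `Map` standing in for Smithers-managed storage; `runWithCache`, `isSummary`, and the hand-written validator are hypothetical, and the step is modeled as a synchronous function for brevity.

```typescript
type Validator<T> = (value: unknown) => value is T;

// Hypothetical helper: look up, validate, and fall back to computing.
function runWithCache<T>(
  key: string,
  cache: Map<string, unknown>,
  isValid: Validator<T>,
  compute: () => T,
): { value: T; fromCache: boolean } {
  const hit = cache.get(key);
  // A hit only counts if it still matches the current output model.
  if (cache.has(key) && isValid(hit)) return { value: hit, fromCache: true };
  const value = compute();
  cache.set(key, value); // persist the fresh output for next time
  return { value, fromCache: false };
}

type Summary = { summary: string; severity: string };

const isSummary = (value: unknown): value is Summary =>
  typeof value === "object" &&
  value !== null &&
  typeof (value as Record<string, unknown>)["summary"] === "string" &&
  typeof (value as Record<string, unknown>)["severity"] === "string";

const cache = new Map<string, unknown>();
let agentCalls = 0;
const callAgent = (): Summary => {
  agentCalls++;
  return { summary: "Tokens expire silently", severity: "high" };
};

const first = runWithCache("k1", cache, isSummary, callAgent); // miss: runs the step
const second = runWithCache("k1", cache, isSummary, callAgent); // hit: reuses the output

// An entry written before `severity` existed fails validation and counts as a miss.
cache.set("k2", { summary: "old shape" });
const third = runWithCache("k2", cache, isSummary, callAgent);
```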

## Resume vs Cache

Resuming and caching are related but distinct.

### Resume

Resume reuses outputs from the same execution id.

Your workflow crashed at step 4. You restart it. Steps 1 through 3 already succeeded in this run, so Smithers skips them. That's resume -- replaying within a single execution.

### Cache

Cache reuses outputs across different executions when the declared cache key matches.

You run a new workflow with the same ticket. The analysis step sees a cache hit from last Tuesday's run. That's caching -- reuse across executions.

Resume is about durability. Cache is about avoiding recomputation.

## Storage

Cached outputs should live in Smithers-managed metadata, keyed by:

- workflow id
- step id
- cache key
- cache version

This metadata belongs to the framework, not to user domain models.

## Suggested Rule

Only cache a step if you would be comfortable describing it as:

> "Given these explicit inputs, I want the same persisted output back."

If that sentence feels false, the step probably should not use caching.

## Next Steps

- [Planner Internals](/concepts/planner-internals) -- See how the workflow graph is planned and scheduled internally.
- [Execution Model](/concepts/execution-model) -- See where cache lookup happens in the durable step lifecycle.
- [Runtime Events](/runtime/events) -- The event reference, where cache-hit and cache-miss events will need to be added in the new design.

---

## Data Model

> How Smithers stores input payloads, task outputs, and internal workflow metadata.
> Source: https://smithers.sh/concepts/data-model

Every database you have ever cursed at got that way for the same reason: someone mixed bookkeeping with business data. Audit timestamps crept into domain objects. Retry counters leaked into API responses. One day you open the schema and cannot tell what the system *does* from what the system *needs to run*.

Smithers refuses to let that happen.

It stores three kinds of data, and it keeps them apart on purpose:

1. the run input payload
2. task output rows
3. internal workflow metadata

Why does the separation matter? Because the question "what did the task produce?" and the question "how many attempts did it take?" have different audiences, different lifecycles, and no business sharing a table. You will thank this design the first time you query your outputs without wading through orchestration columns.

## Run Input

When you kick off a workflow, you hand it a payload. That payload is the *entire* context your workflow gets from the outside world:

```ts
const result = await runWorkflow(workflow, {
  input: { description: "Auth tokens expire silently" },
});
```

Think of it as the function argument for the whole run. Smithers persists it once, and every task in the workflow reads it through `ctx.input`. If the run crashes and resumes, the same input is still there -- unless you explicitly override it.

So what is `input`, exactly? Three properties define it:

- it is user-supplied run context
- it is durable across resume
- it is available everywhere through `ctx.input`

Nothing more. If a value is produced *during* the run, it belongs in a task output, not in the input.

## Task Outputs

Here is where your domain data lives. Most Smithers workflows define output schemas up front with `createSmithers(...)`:

```tsx
const { Workflow, Task, smithers, outputs } = createSmithers({
  analysis: z.object({
    summary: z.string(),
    severity: z.enum(["low", "medium", "high"]),
  }),
});
```

Notice that the schema describes *your* data -- a summary string and a severity level. That is it. No run IDs, no iteration counters, no attempt numbers. You define the shape of the answer; Smithers handles everything else.

Behind the scenes, each schema key becomes a durable SQLite table. Smithers automatically:

- creates the SQLite table
- maps the schema key to a snake_case table name
- adds `runId`, `nodeId`, and `iteration` bookkeeping columns
- validates agent output before persisting it
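
For a feel of the key-to-table mapping, here is a typical camelCase to snake_case conversion. It is an illustration only: `toSnakeCase` is a hypothetical helper, and the exact rule Smithers applies (for example to acronyms or digits) may differ.

```typescript
// Insert an underscore at each lower-to-upper boundary, then lowercase everything.
function toSnakeCase(schemaKey: string): string {
  return schemaKey.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
}

const tableNames = ["analysis", "deployDecision", "deployApproval"].map(toSnakeCase);
```

Under this rule, `deployDecision` maps to `deploy_decision` and `analysis` stays as it is.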

Your prompt-facing schema stays clean:

```ts
z.object({
  summary: z.string(),
  severity: z.enum(["low", "medium", "high"]),
})
```

"Wait," you might be thinking, "if Smithers adds bookkeeping columns anyway, why can't I just add them myself?" You *can*. But then your LLM prompt includes fields it should never fill, your validation conflates domain rules with runtime plumbing, and your query results mix what the task *said* with how it got there. You do not need to add fields like:

- `runId`
- `nodeId`
- `iteration`
- `attempt`
- approval metadata

Smithers owns those. Let it.

## Identity of an Output Row

Here is a subtlety that trips people up. Two different tasks can write to the same output schema. The same task can write to it ten times inside a loop. So how does Smithers know which row is which?

The answer: output identity is not "table name only." Each row is keyed by:

- run id
- task id (`nodeId`)
- iteration when the task is inside a loop

That is why `ctx.output(...)`, `ctx.outputMaybe(...)`, and `ctx.latest(...)` all require both an output target and a `nodeId`. The table tells Smithers *where* to look. The node ID and iteration tell it *which row* you mean.
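
If it helps, you can picture the lookup as a composite key. The sketch below is an in-memory model only: `OutputRow` and `OutputTable` are hypothetical names, and the real rows live in SQLite.

```ts
// Illustrative in-memory model of output row identity.
// OutputRow and OutputTable are hypothetical, not Smithers APIs.
type OutputRow<T> = { runId: string; nodeId: string; iteration: number; value: T };

class OutputTable<T> {
  private rows: OutputRow<T>[] = [];

  put(row: OutputRow<T>): void {
    // Replace any existing row with the same (runId, nodeId, iteration) key.
    this.rows = this.rows.filter(
      (r) => !(r.runId === row.runId && r.nodeId === row.nodeId && r.iteration === row.iteration),
    );
    this.rows.push(row);
  }

  // Exact lookup: one run, one node, one iteration.
  get(runId: string, nodeId: string, iteration = 0): T | undefined {
    return this.rows.find(
      (r) => r.runId === runId && r.nodeId === nodeId && r.iteration === iteration,
    )?.value;
  }

  // Highest-iteration row for a node within a run.
  latest(runId: string, nodeId: string): T | undefined {
    const matches = this.rows
      .filter((r) => r.runId === runId && r.nodeId === nodeId)
      .sort((a, b) => b.iteration - a.iteration);
    return matches[0]?.value;
  }
}

const reviews = new OutputTable<{ approved: boolean }>();
reviews.put({ runId: "run-1", nodeId: "review", iteration: 0, value: { approved: false } });
reviews.put({ runId: "run-1", nodeId: "review", iteration: 1, value: { approved: true } });
reviews.put({ runId: "run-1", nodeId: "other-review", iteration: 0, value: { approved: false } });
```

All three rows share one table, yet each is addressable on its own: the table narrows the search, and the node ID plus iteration pick the row.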

## Custom Drizzle Tables

Sometimes you already have a table, or you need a schema that Smithers cannot auto-generate. In that case, `<Task output={...}>` can point at a custom Drizzle table.

Fair warning: when you go this route, you take on responsibility that Smithers normally handles for you:

- creating and migrating the table
- including Smithers bookkeeping columns such as `runId` and `nodeId`
- including `iteration` in looped tasks
- optionally pairing the table with `outputSchema` for stricter validation

This is an escape hatch, not the default path. If `createSmithers(...)` can express your schema, use it.

## Internal Smithers Metadata

Open your database and you will see tables prefixed with `_smithers_`. Do not be alarmed. These are Smithers' own operational tables:

| Table | Purpose |
|-------|---------|
| `_smithers_runs` | One row per workflow run. Tracks status, heartbeat, VCS revision, and error. |
| `_smithers_nodes` | Current state of each task node within a run (pending, running, finished, failed). |
| `_smithers_attempts` | Every execution attempt for every node, including start/finish timestamps and error detail. |
| `_smithers_frames` | The rendered JSX tree at each commit boundary, stored as serialized XML. |
| `_smithers_approvals` | Approval requests and decisions for tasks gated by `<Approval>`. |
| `_smithers_human_requests` | Human-in-the-loop requests (form fills, confirmations) and their responses. |
| `_smithers_cache` | Cached task outputs keyed by workflow, node, schema signature, and agent signature. |
| `_smithers_sandboxes` | Sandbox session metadata for bubblewrap and container-based execution. |
| `_smithers_tool_calls` | Per-call log of every tool invocation: input, output, latency, and status. |
| `_smithers_events` | Sequential event journal for a run. Source of truth for all observable events. |
| `_smithers_ralph` | Loop (`<Loop>`) iteration counters and completion flags. |
| `_smithers_cron` | Cron schedule definitions, last-run and next-run timestamps. |
| `_smithers_scorers` | Scorer results for each task attempt: score, reason, and latency. |
| `_smithers_vectors` | RAG vector store: chunk text, embeddings (as BLOBs), and metadata. |
| `_smithers_signals` | Inbound signals received by waiting runs. |

This is the machinery that lets Smithers resume a crashed run, retry a failed task, or tell you exactly what happened at 3 a.m. It exists so your output tables never have to carry orchestration concerns.

## Table Schema Ensurance and Auto-Migration

Smithers calls `ensureSmithersTables()` at startup, which runs `CREATE TABLE IF NOT EXISTS` for every internal table. You never need to run migrations by hand for `_smithers_*` tables.

For your own output tables defined via `createSmithers(...)`, Smithers also auto-migrates columns. When the Drizzle schema defines a column that is missing from the SQLite table on disk, Smithers issues an `ALTER TABLE ... ADD COLUMN` statement to add it. Columns that exist in the database but are absent from the schema are left in place -- Smithers does not remove data.

This forward-only migration means you can add fields to an output schema and existing runs will continue to work. Removing a field or changing a column type requires a manual migration or a fresh database.
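
The forward-only rule can be sketched as a simple diff between the columns in code and the columns on disk. `ColumnDef` and `planAddColumns` are illustrative names, and the generated SQL is a guess at the general shape rather than what Smithers emits.

```ts
// Hypothetical sketch of forward-only column migration.
// ColumnDef and planAddColumns are illustrative names, not Smithers APIs.
type ColumnDef = { name: string; sqlType: string };

function planAddColumns(
  table: string,
  schemaColumns: ColumnDef[],
  existingColumns: string[],
): string[] {
  const onDisk = new Set(existingColumns);
  // Only columns missing on disk produce a statement; nothing is ever dropped.
  return schemaColumns
    .filter((col) => !onDisk.has(col.name))
    .map((col) => `ALTER TABLE "${table}" ADD COLUMN "${col.name}" ${col.sqlType}`);
}

const statements = planAddColumns(
  "analysis",
  [
    { name: "summary", sqlType: "TEXT" },
    { name: "severity", sqlType: "TEXT" },
    { name: "confidence", sqlType: "REAL" }, // newly added to the schema
  ],
  ["run_id", "node_id", "iteration", "summary", "severity", "legacy_notes"],
);
```

Here only `confidence` yields a statement. The `legacy_notes` column, present on disk but absent from the schema, is left alone.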

## Schema Signature Verification

Before persisting a cached task result, Smithers computes a schema signature for the output table. The signature is a SHA-256 hash of the table name followed by every column's name, type, nullability, and primary key flag, with the column entries sorted alphabetically:

```
sha256("tableName|colA:text:1:0|colB:integer:0:0|...")
```

This hash is stored as `schema_sig` in `_smithers_cache`. When a cached result is retrieved, Smithers checks that the current table's signature still matches. If the schema changed since caching, the cached entry is ignored and the task runs fresh. You never get silently stale cache hits after a schema migration.
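
A rough sketch of such a signature, using Node's built-in `crypto` module, might look like the following. The `Column` shape and the order of the two flags are assumptions read off the format string above, not the actual Smithers implementation.

```ts
import { createHash } from "node:crypto";

// Illustrative only: the Column shape and flag order are assumptions
// based on the format string above, not the real implementation.
type Column = { name: string; type: string; notNull: boolean; primaryKey: boolean };

function schemaSignature(tableName: string, columns: Column[]): string {
  const parts = columns
    .map((c) => `${c.name}:${c.type}:${c.notNull ? 1 : 0}:${c.primaryKey ? 1 : 0}`)
    .sort(); // sorting makes the hash independent of declaration order
  return createHash("sha256").update([tableName, ...parts].join("|")).digest("hex");
}

const summary: Column = { name: "summary", type: "text", notNull: true, primaryKey: false };
const severity: Column = { name: "severity", type: "text", notNull: true, primaryKey: false };
const confidence: Column = { name: "confidence", type: "real", notNull: false, primaryKey: false };

const v1 = schemaSignature("analysis", [summary, severity]);
const v1Reordered = schemaSignature("analysis", [severity, summary]);
const v2 = schemaSignature("analysis", [summary, severity, confidence]);
```

Reordering columns leaves the signature unchanged, while adding `confidence` produces a different hash, which is what invalidates older cache entries.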

## Transaction Model

`SmithersDb` uses a single-writer transaction model with a serial promise queue. Every write operation (including those outside an explicit transaction) acquires a turn in a `transactionTail` promise chain before proceeding. This serializes all writes even when multiple Effect fibers run concurrently.

Explicit transactions use `BEGIN IMMEDIATE` so SQLite takes the write lock at the start of the transaction instead of upgrading to it on the first write, which avoids hitting `SQLITE_BUSY` partway through:

```ts
await adapter.withTransaction("my-write-group", effect);
```

Nested transactions from the same fiber are detected and rejected -- SQLite does not support true savepoints through this interface. The transaction depth counter and owner-thread tracking ensure the same fiber can perform multiple writes within a single transaction without re-acquiring the queue turn.
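
The "tail promise" idea is easier to see in isolation. This is a minimal sketch of a serial promise queue, assuming nothing about Smithers internals beyond the description above; `SerialQueue` is a hypothetical name, and the real `SmithersDb` also tracks transaction ownership and depth.

```ts
// Minimal sketch of a serial promise queue. SerialQueue is a hypothetical
// name; only the "tail promise" idea comes from the description above.
class SerialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  // Each operation starts only after every previously enqueued one settles.
  run<T>(op: () => Promise<T>): Promise<T> {
    const result = this.tail.then(op);
    // Swallow rejections on the tail so one failure does not block the queue.
    this.tail = result.catch(() => undefined);
    return result;
  }
}

const queue = new SerialQueue();
const order: string[] = [];
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Enqueued together, but the second cannot start until the first finishes.
const done = Promise.all([
  queue.run(async () => {
    await sleep(20);
    order.push("slow write");
  }),
  queue.run(async () => {
    order.push("fast write");
  }),
]);
```

Even though the second operation is faster, it records its write after the first, because it waits for its turn on the tail.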

## Write Retry and Exponential Backoff

All write paths wrap the underlying operation with `withSqliteWriteRetryEffect`. When a write fails with `SQLITE_BUSY`, `SQLITE_IOERR`, "database is locked", or "disk i/o error", Smithers retries up to six times with exponential backoff:

- Base delay: 50 ms
- Maximum delay: 2,000 ms
- Jitter: ±25% of the computed delay
- Each retry increments the `smithers.db.retries` counter

After the maximum number of retries, the error propagates as a `DB_WRITE_FAILED` SmithersError. This makes Smithers resilient to transient WAL-mode lock contention without requiring any configuration.
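
As a back-of-the-envelope sketch, the delay schedule could be computed like this. The base, cap, and jitter come from the list above; the doubling factor, the choice to apply jitter after the cap, and the function name are assumptions.

```ts
// Sketch of the delay schedule. The constants come from the list above;
// the doubling factor and the function name are assumptions.
const BASE_DELAY_MS = 50;
const MAX_DELAY_MS = 2000;
const JITTER_RATIO = 0.25;

// attempt is zero-based; random is injectable so the schedule is testable.
function retryDelayMs(attempt: number, random: () => number = Math.random): number {
  const capped = Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
  const jitter = (random() * 2 - 1) * JITTER_RATIO * capped; // within ±25%
  return Math.round(capped + jitter);
}

// With jitter pinned to zero (random() returns 0.5), six retries wait:
const schedule = [0, 1, 2, 3, 4, 5].map((attempt) => retryDelayMs(attempt, () => 0.5));
```

Under these assumptions the six retries add up to a few seconds of waiting in total, which is enough to ride out a short burst of lock contention.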

## Frame Codec

Render frames in `_smithers_frames` are stored in one of three encodings:

| Encoding | When used | Description |
|----------|-----------|-------------|
| `full` | Frame 0 and any keyframe | Complete serialized XML of the render tree |
| `delta` | Frames between keyframes | JSON patch (set, insert, remove ops) relative to the previous frame |
| `keyframe` | Every 50th frame | Same as `full`; resets the delta chain |

The keyframe interval is 50 frames (`FRAME_KEYFRAME_INTERVAL = 50`). Reading an arbitrary frame requires loading the nearest preceding keyframe and applying all deltas up to the target frame number. An in-memory LRU cache (up to 512 entries) stores reconstructed frame XML, so repeated reads of hot frames skip that reconstruction.

Delta encoding uses a structural diff algorithm that walks the JSON form of the XML tree, emitting `set`, `insert`, and `remove` operations. It is node-ID-aware: when comparing adjacent objects in the tree, it uses the `id` prop of element nodes as a stable identity anchor, so reordered elements produce insert/remove pairs rather than spurious updates.
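
To see what a read costs, here is a small sketch of which stored frames a reconstruction touches. It assumes keyframes fall on frame numbers that are multiples of the interval; `framesNeeded` is a hypothetical helper.

```ts
// Sketch of which stored frames a read has to touch. The interval matches
// FRAME_KEYFRAME_INTERVAL above; the function itself is hypothetical and
// assumes keyframes sit at multiples of the interval (0, 50, 100, ...).
const FRAME_KEYFRAME_INTERVAL = 50;

function framesNeeded(target: number): { keyframe: number; deltas: number[] } {
  const keyframe = Math.floor(target / FRAME_KEYFRAME_INTERVAL) * FRAME_KEYFRAME_INTERVAL;
  const deltas: number[] = [];
  for (let n = keyframe + 1; n <= target; n++) deltas.push(n);
  return { keyframe, deltas };
}

// Reading frame 103 starts at keyframe 100 and replays three deltas.
const plan = framesNeeded(103);
```

The worst case under this assumption is a read just before the next keyframe, which replays 49 deltas. That bound is what the keyframe interval buys you.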

## Signal Persistence

Signals are external messages sent to a running workflow. When a signal arrives, Smithers writes it to `_smithers_signals` with an automatically allocated sequence number. You never pick the `seq` yourself -- Smithers computes `MAX(seq) + 1` inside a `BEGIN IMMEDIATE` transaction so two concurrent signals never collide.

Before inserting, the adapter checks whether an identical signal already exists (same `runId`, `signalName`, `correlationId`, `payloadJson`, `receivedAtMs`, and `receivedBy`). If a match is found, the existing `seq` is returned and no duplicate row is created. This deduplication prevents a replay or retry from recording the same signal twice.

### Signal Query Filters

Querying signals supports four filters, all optional:

| Filter | Column | Description |
|--------|--------|-------------|
| `signalName` | `signal_name` | Match a specific signal type |
| `correlationId` | `correlation_id` | Match a specific correlation key (supports `null`) |
| `receivedAfterMs` | `received_at_ms` | Only signals received at or after this timestamp |
| `limit` | -- | Max rows to return (default 200) |

Results are ordered by `seq ASC`, so you always see signals in arrival order.

## Event Persistence

The `_smithers_events` table is the durable event journal for each run. Every `SmithersEvent` emitted during execution is persisted here with a sequential `seq` number that serves as the total ordering.

### Auto-Sequence Allocation

Like signals, events get their `seq` via `SELECT COALESCE(MAX(seq), -1) + 1` inside a `BEGIN IMMEDIATE` transaction. This guarantees gap-free, monotonically increasing sequence numbers per run.

### Insert Deduplication

Before inserting, the adapter checks for an existing row matching the same `runId`, `timestampMs`, `type`, and `payloadJson`. If found, the existing `seq` is returned without creating a duplicate. This makes event insertion idempotent across retries.
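
Sequence allocation and deduplication fit together as shown in this in-memory sketch. `EventJournal` is a hypothetical stand-in for the SQLite-backed adapter, and the event type strings are made up for illustration.

```ts
// In-memory sketch of sequence allocation plus insert deduplication.
// EventJournal is a hypothetical stand-in for the SQLite-backed adapter.
type EventRow = {
  runId: string;
  timestampMs: number;
  type: string;
  payloadJson: string;
  seq: number;
};

class EventJournal {
  private rows: EventRow[] = [];

  insert(event: Omit<EventRow, "seq">): number {
    // Dedup: an identical event returns the seq it already has.
    const existing = this.rows.find(
      (r) =>
        r.runId === event.runId &&
        r.timestampMs === event.timestampMs &&
        r.type === event.type &&
        r.payloadJson === event.payloadJson,
    );
    if (existing) return existing.seq;

    // Equivalent of COALESCE(MAX(seq), -1) + 1, scoped to the run.
    const maxSeq = this.rows
      .filter((r) => r.runId === event.runId)
      .reduce((max, r) => Math.max(max, r.seq), -1);
    const seq = maxSeq + 1;
    this.rows.push({ ...event, seq });
    return seq;
  }
}

const journal = new EventJournal();
// "NodeStarted" and "NodeFinished" are illustrative type strings.
const started = { runId: "run-1", timestampMs: 1000, type: "NodeStarted", payloadJson: '{"nodeId":"analyze"}' };
const first = journal.insert(started);
const retried = journal.insert(started); // same event, replayed after a retry
const second = journal.insert({ ...started, timestampMs: 1500, type: "NodeFinished" });
```

The replayed insert returns the original `seq` instead of allocating a new one, so the journal stays gap-free even when a write is retried.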

### Event Queue and Flush

For performance, events can be enqueued asynchronously via `emitEventQueued`. The event is emitted to listeners and tracked immediately, but database and log-file persistence happens in a background promise chain (`persistTail`). Call `flush()` to await all queued persistence -- the engine does this at task boundaries and run completion to ensure nothing is lost.

### Sequence Start Override

The `EventBus` constructor accepts a `startSeq` option, which sets the initial sequence counter. This is used on resume to continue from where the previous run left off, preventing sequence number collisions with already-persisted events.

### Event History Queries

The adapter supports filtered history queries with these parameters:

| Filter | Description |
|--------|-------------|
| `afterSeq` | Return events with `seq > afterSeq` |
| `limit` | Max rows |
| `nodeId` | Filter by `$.nodeId` inside the payload JSON |
| `types` | Filter to specific event type strings |
| `sinceTimestampMs` | Events at or after this timestamp |

A separate `countEventHistory` method returns the count matching the same filters, useful for pagination.

## Human Request Persistence

Human requests (form fills, confirmations, free-text prompts) are stored in `_smithers_human_requests` with lifecycle states: `pending`, `answered`, `cancelled`, `expired`.

### Pending Inbox Query

`listPendingHumanRequests` returns all pending requests across all runs, joined with `_smithers_runs` and `_smithers_nodes` to include the `workflowName`, `runStatus`, and `nodeLabel`. Before returning, it automatically expires any requests whose `timeoutAtMs` has passed, transitioning them to `expired` status.

### Answer Persistence

`answerHumanRequest` sets the response JSON, timestamp, and optional `answeredBy` field, transitioning the request from `pending` to `answered`. Only pending requests can be answered -- the `WHERE status = 'pending'` clause prevents double-answering.

### Cancellation

`cancelHumanRequest` transitions a pending request to `cancelled`. Like answering, it only operates on requests in `pending` status.
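
The lifecycle boils down to one guard: every transition starts from `pending`. The sketch below models that guard in memory. The types and functions are illustrative and mirror the `WHERE status = 'pending'` clause rather than any real adapter method.

```ts
// Sketch of the pending-only transition guard. Types and functions here are
// illustrative; they mirror the WHERE status = 'pending' clause.
type Status = "pending" | "answered" | "cancelled" | "expired";
type HumanRequest = {
  id: string;
  status: Status;
  timeoutAtMs: number | null;
  responseJson: string | null;
};

// Each function returns true only when it actually changed the request.
function answer(req: HumanRequest, responseJson: string): boolean {
  if (req.status !== "pending") return false;
  req.status = "answered";
  req.responseJson = responseJson;
  return true;
}

function cancel(req: HumanRequest): boolean {
  if (req.status !== "pending") return false;
  req.status = "cancelled";
  return true;
}

function expireIfDue(req: HumanRequest, nowMs: number): boolean {
  if (req.status !== "pending" || req.timeoutAtMs === null || req.timeoutAtMs > nowMs) return false;
  req.status = "expired";
  return true;
}

const request: HumanRequest = { id: "req-1", status: "pending", timeoutAtMs: null, responseJson: null };
const firstAnswer = answer(request, '{"approved":true}');
const secondAnswer = answer(request, '{"approved":false}'); // rejected: no longer pending
const lateCancel = cancel(request); // rejected for the same reason
```

Once a request leaves `pending`, every later transition is a no-op, so the first answer is the one that sticks.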

## Cron Persistence

Cron schedules are stored in `_smithers_cron` and managed through the adapter:

| Operation | Method | Description |
|-----------|--------|-------------|
| Create/Update | `upsertCron` | Inserts or updates a cron schedule by `cronId` |
| List | `listCrons(enabledOnly?)` | Returns all cron entries, optionally filtering to `enabled = true` |
| Track execution | `updateCronRunTime` | Updates `lastRunAtMs`, `nextRunAtMs`, and optional `errorJson` |
| Delete | `deleteCron` | Removes a cron entry by ID |

The `enabled` flag allows disabling a schedule without deleting it. The `lastRunAtMs` and `nextRunAtMs` columns let the scheduler know when to fire next without recomputing from the cron pattern on every poll. If a scheduled run fails, the error is stored in `errorJson` on the cron row for diagnostics.

## Run Lifecycle Management

### Stale Run Claims

The supervisor detects stale runs by querying `_smithers_runs` for rows with `status = 'running'` whose `heartbeat_at_ms` is older than the stale threshold (default 30 seconds). To safely resume a stale run without races, the supervisor uses a compare-and-swap pattern:

1. **Claim**: `claimRunForResume` atomically sets `runtime_owner_id` and `heartbeat_at_ms` only if the current values match the expected stale state. The `WHERE` clause checks `runtime_owner_id`, `heartbeat_at_ms`, and the stale threshold in a single `UPDATE`, and returns whether the row was modified.

2. **Release**: If the supervisor decides not to resume after claiming, `releaseRunResumeClaim` restores the original `runtime_owner_id` and `heartbeat_at_ms`, but only if the claim is still held (the current `runtime_owner_id` matches the claimer).

This two-phase claim prevents two supervisor instances from resuming the same stale run simultaneously.
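
The compare-and-swap step can be modeled in a few lines. In this sketch `RunRow` and `claimRun` are hypothetical, and the check that Smithers performs inside a single SQL `UPDATE` is written as ordinary TypeScript.

```ts
// In-memory sketch of the compare-and-swap claim. RunRow and claimRun are
// hypothetical; the real check happens inside a single SQL UPDATE.
type RunRow = {
  runId: string;
  status: string;
  runtimeOwnerId: string | null;
  heartbeatAtMs: number;
};

function claimRun(
  row: RunRow,
  expected: { runtimeOwnerId: string | null; heartbeatAtMs: number },
  claimerId: string,
  nowMs: number,
  staleThresholdMs: number,
): boolean {
  const unchanged =
    row.runtimeOwnerId === expected.runtimeOwnerId &&
    row.heartbeatAtMs === expected.heartbeatAtMs;
  const stale = nowMs - row.heartbeatAtMs > staleThresholdMs;
  if (row.status !== "running" || !unchanged || !stale) return false;
  row.runtimeOwnerId = claimerId;
  row.heartbeatAtMs = nowMs;
  return true;
}

const run: RunRow = { runId: "run-123", status: "running", runtimeOwnerId: "worker-a", heartbeatAtMs: 0 };
const observed = { runtimeOwnerId: run.runtimeOwnerId, heartbeatAtMs: run.heartbeatAtMs };

// Two supervisors saw the same stale row; only the first claim succeeds.
const firstClaim = claimRun(run, observed, "supervisor-1", 60000, 30000);
const secondClaim = claimRun(run, observed, "supervisor-2", 60001, 30000);
```

The second supervisor loses because the row no longer matches what it observed: the owner and heartbeat changed the moment the first claim landed.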

### Sandbox Tracking

Sandbox sessions (bubblewrap, Docker, or Codeplane) are tracked in `_smithers_sandboxes`. The adapter upserts sandbox rows keyed by `(runId, sandboxId)`, recording runtime type, configuration, status, shipping and completion timestamps, and bundle paths.

## Output Edge Cases

### Payload-Only Tables

When an output table's only non-bookkeeping column is `payload`, Smithers detects it and wraps the entire agent output into that single column instead of spreading fields across multiple columns. This is useful for unstructured or polymorphic outputs where a fixed column set does not make sense.

### Boolean Column Coercion

Bun's SQLite driver returns raw `0`/`1` integers for columns declared with `{ mode: "boolean" }` in Drizzle. When loading output snapshots, Smithers detects these columns by inspecting the Drizzle table metadata and coerces the integer values to proper JavaScript booleans. Without this, strict equality checks like `value === true` would fail.
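
A minimal sketch of the coercion follows. In Smithers the boolean columns are discovered from Drizzle table metadata; here they are passed in by hand, and `coerceBooleans` is a hypothetical name.

```ts
// Sketch of integer-to-boolean coercion. In Smithers the boolean columns
// come from Drizzle table metadata; here they are passed in by hand.
function coerceBooleans(
  row: Record<string, unknown>,
  booleanColumns: string[],
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...row };
  for (const col of booleanColumns) {
    // Only raw 0/1 values in declared boolean columns are converted.
    if (out[col] === 0 || out[col] === 1) out[col] = out[col] === 1;
  }
  return out;
}

const raw = { summary: "Token refresh missing", approved: 1, retries: 1 };
const coerced = coerceBooleans(raw, ["approved"]);
```

Note that `retries` keeps its numeric `1`: only columns declared as boolean are touched, so genuine integers are never misread as flags.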

### Schema Key Aliasing

When loading outputs via `loadOutputs`, each result set is stored under **both** the schema key (e.g., `"analysis"`) and the actual SQLite table name (e.g., `"analysis"` or a custom name). This dual indexing lets downstream code reference outputs by either name, which matters when schema keys and table names diverge (e.g., with custom Drizzle tables).

## Snapshot Persistence

Loading a complete workflow snapshot (`loadInput` + `loadOutputs`) reconstructs the full `ctx` state from SQLite. The input is loaded by filtering the input table for the current `runId`. Outputs are loaded by iterating every schema key, querying each table for rows matching the `runId`, applying boolean coercion, and indexing under both schema key and table name.

This snapshot is what powers resume: when a crashed run restarts, the snapshot populates `ctx` so the JSX tree renders with all completed outputs already in place.

## Transaction Internals

### Read Gating

Reads also acquire a turn in the `transactionTail` promise queue, just as writes do. This prevents reads from seeing intermediate state during a multi-statement transaction. If the current fiber already owns the active transaction, reads proceed immediately without acquiring a new turn.

### Commit Retry

The entire `withTransaction` call is wrapped in `withSqliteWriteRetryEffect`. If the `COMMIT` (or `BEGIN IMMEDIATE`) fails with `SQLITE_BUSY` or an I/O error, the retry mechanism rolls back and retries the full transaction from `BEGIN`, using the same exponential backoff as standalone writes.

## Why the Separation Matters

There are two kinds of question you can ask about any completed task.

Your workflow output answers: *what did this task produce?*

Smithers metadata answers:

- when did it run?
- how many attempts did it take?
- was it cached?
- did it wait for approval?
- which loop iteration produced it?

These are fundamentally different concerns. Mixing them is like storing a book's page count in the same field as its ISBN -- technically possible, obviously wrong. Keep them apart and both stay easy to reason about.

## Schema Changes

Changing a Zod output schema is not just a prompt tweak. It is a persistence change. The table on disk has to match the schema in code.

Typical examples:

- adding a field
- removing a field
- changing a field type
- tightening validation rules

In hot-reload mode, Smithers blocks these changes and requires a restart so output resolution stays deterministic. This is deliberate friction -- it forces you to think about the migration before the data gets inconsistent.

If you use custom Drizzle tables, you must manage those migrations yourself.

## Direct Queries

Smithers does not hide SQLite from you. The database is right there. Open it, poke around, write queries.

Use output tables when you care about business results.
Use `_smithers_*` tables when you care about execution history.

This is one of the advantages of keeping the layers separate: you can hand your output tables to an analyst who has never heard of Smithers, and the data makes sense on its own.

## Mental Model

When in doubt, apply this rule of thumb:

- `ctx.input` is run-scoped input
- output tables hold validated task results
- `_smithers_*` tables hold orchestration state

If a field only exists to help the runtime schedule or resume work, it belongs in Smithers metadata, not in your domain schema. If a field describes what the task actually produced, it belongs in an output table, not in `_smithers_*`. The line is clean. Keep it that way.

## Next Steps

- [Execution Model](/concepts/execution-model) -- See how these tables participate in render, scheduling, and resume.
- [Structured Output](/guides/structured-output) -- Validation and persistence details for task outputs.
- [Debugging](/guides/debugging) -- Query the internal tables directly when a run behaves unexpectedly.

---

## Unidirectional Dataflow

> How Smithers' one-way data cycle makes workflows easy to write, read, debug, and evolve at three different levels.
> Source: https://smithers.sh/concepts/unidirectional-dataflow

Here is the single constraint that makes Smithers simple: **data flows in one direction.**

State produces a plan. The plan produces events. Events update state. The cycle repeats. There is never a backwards arrow. Once you internalize that, everything else -- branching, loops, hot reload, time travel -- falls out for free.

<img src="/images/unidirectional-dataflow.jpg" alt="Smithers unidirectional data flow diagram" />

## The Cycle

```
State  ──→  Execution Plan  ──→  Actions/Events
  ↑                                     │
  └─────────────────────────────────────┘
```

Five steps. Memorize them once.

1. **State** is persisted in SQLite -- task outputs, cursor position, iteration counts.
2. A **render function** maps that state to an execution plan (your JSX tree).
3. The engine executes the next ready task, which emits an **event**.
4. The event handler writes new state, triggering a **re-render**.
5. Go to 1.

That's it. Every Smithers workflow -- from a three-task pipeline to a multi-agent loop that spawns work dynamically -- runs on this cycle.

In simple workflows the plan never changes between renders. In complex ones, it evolves: tasks appear, branches switch, loops continue or stop. But the mechanism is always the same one-way loop.
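
If you want to see the cycle without any of the machinery, here is a toy model in plain TypeScript. Everything in it is illustrative: the real engine persists state in SQLite, renders JSX, and runs agents, whereas this sketch uses an object, a function, and a stub.

```ts
// Toy model of the cycle: state -> plan -> event -> state. Everything here
// is illustrative; the real engine persists state in SQLite and runs agents.
type State = { outputs: Record<string, string[]> };

// "Render": the plan is a pure function of state.
function render(state: State): string[] {
  const plan = ["analyze"];
  for (const issue of state.outputs["analyze"] ?? []) plan.push(`fix-${issue}`);
  return plan;
}

// "Execute": pretend to run a task and return its output.
function execute(taskId: string): string[] {
  return taskId === "analyze" ? ["1", "2", "3"] : ["done"];
}

const state: State = { outputs: {} };
const frames: string[][] = [];

for (;;) {
  const plan = render(state);
  frames.push(plan); // one frame per render
  const next = plan.find((id) => !(id in state.outputs));
  if (next === undefined) break; // every task in the plan has an output
  state.outputs[next] = execute(next); // the "event" updates state
}
```

The first frame contains only `analyze`. Once its output lands in state, the next render returns four tasks, and nothing in the loop ever edits the plan directly.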

<img src="/images/state-machine.jpg" alt="Smithers state machine diagram" />

## Three Levels of Plan Evolution

You might be thinking: "If the plan is just a function of state, how does it actually change over time?" Good question. It changes at three progressively deeper levels, and understanding them is the key to thinking in Smithers.

Picture a flipbook. You know the kind -- a stack of pages, each with a slightly different drawing, and when you flip through them fast enough you see animation. Every Smithers run is a flipbook. Each page is a **frame**: the complete execution plan at one moment in time.

Now: what can change between pages?

### Level 1: Cursor Movement

The simplest kind of animation. The drawing stays the same; only the highlight moves.

Inside a `<Sequence>`, the cursor advances from one task to the next as each completes. The plan itself doesn't change -- only which task is active.

```
Frame 1:  [analyze >] → [fix] → [review]
Frame 2:  [analyze +] → [fix >] → [review]
Frame 3:  [analyze +] → [fix +] → [review >]
```

Same three tasks on every page. Here `>` marks the active task and `+` marks a completed one; the marker just moves forward. Even the flattest workflow uses this -- it's the baseline.

### Level 2: Reactive Re-rendering

Now it gets interesting. What if you don't know the full plan at the start?

Every time a task finishes, Smithers re-renders the entire JSX tree with the updated state. This is not just moving a cursor. This is **drawing a new page with more tasks on it than the page before**.

```tsx
export default smithers((ctx) => {
  const analysis = ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" });

  return (
    <Workflow name="adaptive">
      <Task id="analyze" output={outputs.analysis} agent={analyst}>
        Analyze the codebase.
      </Task>

      {/* These tasks don't exist until analysis completes */}
      {analysis?.issues.map((issue) => (
        <Task key={issue.id} id={`fix-${issue.id}`} output={outputs.fix} agent={coder}>
          {`Fix: ${issue.description}`}
        </Task>
      ))}
    </Workflow>
  );
});
```

**Frame 1**: Only `analyze` exists in the plan. The fix tasks are not hidden or waiting -- they literally do not exist yet.

**Frame 2**: `analyze` finished and found 3 issues. Now `fix-1`, `fix-2`, `fix-3` appear. The plan grew from 1 task to 4.

Stop and let that sink in. You never told Smithers "after step one, create three more steps." You wrote a function that takes state and returns a plan. When the state changed, the plan changed. The plan is a **derived value** -- a pure function of state. You never mutate it. You update state, and the plan follows.

This is the aha moment. If you have worked with React, it is the same insight: you don't manipulate the DOM, you describe what it should look like given the current data, and the framework figures out the diff. Smithers does that, except the "DOM" is an execution plan and the "data" lives in SQLite.

### Level 3: Hot Reload (`--hot`)

The first two levels change what's *on* the pages. Level 3 changes the *art style* -- the source code itself.

With the `--hot` flag, you can edit the workflow definition, prompts, or agent configurations while the run is in progress. Smithers picks up the changes on the next render cycle without restarting.

```bash
smithers up workflow.tsx --hot true
```

This means you can:
- Edit a prompt mid-run to steer an agent differently
- Add a new task to the plan while earlier tasks are still executing
- Adjust retry policies or timeout values live

Why does this work? Because it's still the same loop. Render state into a plan, execute, persist, render again. Level 1 changes the cursor. Level 2 changes the state. Level 3 changes the code. The mechanism doesn't care which one changed -- it just renders a new frame.

## Why One Direction Matters

You could build workflows with bidirectional data, callbacks, event buses, pub/sub channels. Plenty of orchestration frameworks do. So why does Smithers insist on one direction?

Because constraints buy you things.

### Easy to Write

LLMs already know React. A Smithers workflow is a function that takes state and returns JSX. No graph DSL, no edge definitions, no scheduler configuration. An LLM can generate correct workflows the same way it generates React components -- because that's exactly what they are.

### Easy to Read

Data flows one way: state down, events up. When you read a workflow, you see the full plan for any given state right there in the JSX. There is no action-at-a-distance. No callback registered elsewhere that secretly modifies the plan. If you want to know what the plan looks like when `analysis` has three issues, read the function with that state in your head. That's it.

### Easy to Debug

Every render produces a **frame** -- a snapshot of the complete plan at that moment. When something unexpected happens, the debugging process is mechanical:

1. Find the frame where the plan diverged from what you expected.
2. Look at the state that produced that frame.
3. The bug is in the gap between the state you see and the plan you expected.

Because data flows one way, the cause is always upstream of the symptom. You never have to wonder "what changed this?" and trace backwards through a tangle of event handlers. The state changed. The render function ran. The plan came out wrong. That narrows your search to one function.

### Time Travel

Since every frame is persisted, you can inspect the history of your workflow and choose the task attempt you want to restore. `revert` operates on JJ-backed attempt snapshots rather than arbitrary frame numbers.

```bash
smithers revert workflow.tsx --run-id run-123 --node-id review --attempt 1 --iteration 0
smithers up workflow.tsx --run-id run-123 --resume true
```

Why does time travel work so cleanly? Because the plan is a pure function of persisted state plus the restored workspace snapshot. Revert the state, re-render, and Smithers continues from that point. No stale callbacks, no orphaned listeners, no dangling references to tasks that no longer exist. Just state in, plan out.

## The Mental Model

Come back to the flipbook.

- Each **page** is a frame -- the complete execution plan at one point in time.
- **Flipping forward** happens automatically as tasks complete (Level 1).
- **Drawing new pages** happens when state changes cause the plan to evolve (Level 2).
- **Redrawing the art style** happens when you change the source code with `--hot` (Level 3).

But here's what makes the flipbook metaphor precise rather than just cute: in a real flipbook, every page is pre-drawn. In Smithers, each page is *computed* from the current state. You don't plan the whole animation upfront. You define how to draw one page given the current data, and the runtime flips through as many pages as the work requires.

That is unidirectional dataflow. One constraint. Everything else follows.

## Next Steps

- [Reactivity](/concepts/reactivity) -- Deep dive into the render-schedule-execute loop and React patterns.
- [Execution Model](/concepts/execution-model) -- How the engine turns frames into durable task execution.
- [Workflow State](/concepts/workflow-state) -- The ctx API and how state flows between tasks.

---

## Reactivity

> How Smithers uses React's component model to build execution plans that evolve over time.
> Source: https://smithers.sh/concepts/reactivity

What if your workflow could change shape while it was running?

Not "skip a step" — actually grow new tasks that didn't exist a moment ago, because a previous task just produced the information that makes them possible. That's what Smithers does, and it does it with a tool you already know: React.

But set aside the DOM for a moment. Think of React here as a compiler, not a UI library. Your JSX compiles into an execution plan. And because React re-renders when data changes, the plan re-compiles every time a task finishes.

## The Core Insight

Here is a workflow with two tasks, except the second one doesn't exist yet:

```tsx
export default smithers((ctx) => {
  const analysis = ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" });

  return (
    <Workflow name="pipeline">
      <Task id="analyze" output={outputs.analysis} agent={analyst}>
        Analyze the codebase.
      </Task>

      {analysis ? (
        <Task id="fix" output={outputs.fix} agent={coder}>
          {`Fix: ${analysis.issues.join(", ")}`}
        </Task>
      ) : null}
    </Workflow>
  );
});
```

**Frame 1**: `analysis` is `undefined`. Only `analyze` is mounted. The plan has one task.

**Frame 2** (after `analyze` completes): `analysis` has a value. Both tasks are mounted. The plan grew.

Read that again. The second task doesn't just "wait" — it literally does not exist in the execution plan until the first task produces a result. The plan is not static. It unfolds over time as each render cycle reveals new tasks.

This is the key insight: your workflow is a function of its own outputs.

## The Render-Schedule-Execute Loop

Every Smithers run follows this cycle:

```
┌─────────────────────────────────────────────┐
│  1. RENDER   Build JSX tree with current ctx │
│  2. EXTRACT  Walk tree → TaskDescriptor[]    │
│  3. SCHEDULE Find ready tasks                │
│  4. EXECUTE  Run ready tasks as Effects      │
│  5. PERSIST  Write outputs to SQLite         │
│  6. DETECT   Did the mounted task set change?│
│              If yes → go to 1                │
│              If all done → finish             │
└─────────────────────────────────────────────┘
```

Step 6 is where the magic lives. After persisting outputs, Smithers re-renders the JSX tree. If `ctx.outputMaybe()` now returns a value it didn't before, new tasks may mount. The engine detects this by comparing the set of mounted task IDs between frames and continues the loop.

Why does this matter? Because tasks appear in the plan at the moment their preconditions are met — not a moment before. You never declare "task B depends on task A." You write a conditional, and the dependency emerges from the render.
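
The detection in step 6 amounts to a set comparison. Here is a small sketch of that check; `mountedSetChanged` is a hypothetical helper, not a Smithers export.

```ts
// Sketch of the step 6 check: compare mounted task IDs between two frames.
// mountedSetChanged is a hypothetical helper, not a Smithers export.
function mountedSetChanged(previous: string[], next: string[]): boolean {
  const before = new Set(previous);
  const after = new Set(next);
  // Equal sizes plus "every new ID was already mounted" means no change.
  return before.size !== after.size || next.some((id) => !before.has(id));
}

const grew = mountedSetChanged(["analyze"], ["analyze", "fix"]);
const reordered = mountedSetChanged(["analyze", "fix"], ["fix", "analyze"]);
const swapped = mountedSetChanged(["analyze", "fix"], ["analyze", "review"]);
```

A task mounting or unmounting counts as a change. The same IDs in a different order do not, since the comparison is over sets rather than lists.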

## React as a Workflow Compiler

You might be wondering: why React? Why not just a function that returns a list of tasks?

Because React gives you a reconciler — a well-tested engine for turning a tree of declarations into a structured result. Smithers implements a **custom React reconciler** that produces an in-memory `HostElement` tree instead of DOM nodes. The reconciler is used purely for tree construction:

- `React.createElement()` calls build the component tree
- The reconciler resolves props, children, and conditional branches
- The result is walked to extract `TaskDescriptor` objects
- Those descriptors drive scheduling and execution

That machinery gives Smithers several properties for free. Let's walk through them.

### Conditional Mounting

Standard JSX conditions control which tasks exist in the plan:

```tsx
{analysis?.hasIssues ? (
  <Task id="fix" output={outputs.fix} agent={coder}>Fix the issues.</Task>
) : null}
```

This is not a "skip" — the task literally does not exist in the execution plan until the condition is met. No node in the graph. No placeholder. Nothing.

### Component Composition

Here's where it gets interesting. Large workflows decompose into reusable components — the same way large UIs do:

```tsx
function ReviewCycle({ ticket }: { ticket: Ticket }) {
  const ctx = useCtx();
  const review = ctx.latest("review", `${ticket.id}:review`);

  return (
    <Loop until={review?.approved} maxIterations={5}>
      <Sequence>
        <Implement ticket={ticket} />
        <Validate ticket={ticket} />
        <Review ticket={ticket} />
      </Sequence>
    </Loop>
  );
}

// Use it in the main workflow
<Workflow name="multi-ticket">
  {tickets.map((ticket) => (
    <ReviewCycle key={ticket.id} ticket={ticket} />
  ))}
</Workflow>
```

Each `<ReviewCycle>` is a self-contained workflow fragment with its own loop, its own state lookups, and its own conditional logic. This is standard React composition — but it's building an execution plan, not a UI.

### Dynamic Task Generation

Because JSX is just JavaScript, you can generate tasks from runtime data:

```tsx
// Generate tasks from runtime data
{repos.map((repo) => (
  <Task key={repo.id} id={`${repo.id}:analyze`} output={outputs.analysis} agent={analyst}>
    {`Analyze ${repo.name}`}
  </Task>
))}
```

The plan adapts to whatever data is available at render time. Ten repos? Ten tasks. A hundred? A hundred. You don't decide ahead of time.

## Custom Hooks

Smithers provides `useCtx()` — a React hook that returns the workflow context. If you've written custom hooks before, you already know what to do. Build on top of it to extract common patterns.

### Extracting Output Logic

```tsx
function useReviewState(ticketId: string) {
  const ctx = useCtx();
  const claudeReview = ctx.latest("review", `${ticketId}:review-claude`);
  const codexReview = ctx.latest("review", `${ticketId}:review-codex`);

  return {
    claudeReview,
    codexReview,
    allApproved: !!claudeReview?.approved && !!codexReview?.approved,
    issues: [
      ...(claudeReview?.issues ?? []),
      ...(codexReview?.issues ?? []),
    ],
  };
}

// Clean component code
function ReviewFix({ ticket }: { ticket: Ticket }) {
  const { allApproved, issues } = useReviewState(ticket.id);

  return (
    <Task
      id={`${ticket.id}:review-fix`}
      output={outputs.reviewFix}
      agent={codex}
      skipIf={allApproved || issues.length === 0}
    >
      <ReviewFixPrompt issues={issues} />
    </Task>
  );
}
```

Notice what happened: the messy "check two reviewers, merge their issues" logic moved into a hook. The component just asks: are we approved? What are the issues? The hook is testable, reusable, and keeps the workflow component clean.

### Encapsulating Iteration Patterns

```tsx
function useIterationFeedback(ticketId: string) {
  const ctx = useCtx();

  return {
    previousImplement: ctx.latest("implement", `${ticketId}:implement`),
    previousValidation: ctx.latest("validate", `${ticketId}:validate`),
    isFirstIteration: ctx.iteration === 0,
    iterationCount: ctx.iterationCount("implement", `${ticketId}:implement`),
  };
}
```

### Conditional Workflow Fragments

```tsx
function useFeatureFlags() {
  const ctx = useCtx();
  return {
    parallelReview: ctx.input.enableParallelReview !== false,
    maxReviewRounds: ctx.input.maxReviewRounds ?? 3,
    skipValidation: ctx.input.skipValidation === true,
  };
}

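function Pipeline({ ticket }: { ticket: Ticket }) {
  const ctx = useCtx();
  const flags = useFeatureFlags();
  // Assumes the review task writes to `${ticket.id}:review`, as in ReviewCycle above.
  const review = ctx.latest("review", `${ticket.id}:review`);

  return (
    <Loop until={review?.approved} maxIterations={flags.maxReviewRounds}>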
      <Sequence>
        <Implement ticket={ticket} />
        {!flags.skipValidation && <Validate ticket={ticket} />}
        {flags.parallelReview ? (
          <ParallelReview ticket={ticket} />
        ) : (
          <SingleReview ticket={ticket} />
        )}
      </Sequence>
    </Loop>
  );
}
```

Feature flags in a workflow. No special DSL. Just props and conditionals.

## React Patterns That Work

Because Smithers uses a real React reconciler, many standard React patterns work as-is. If you've used them in a UI, you can use them in a workflow.

### Props and Children

```tsx
function AgentTask({ id, agent, children, ...taskProps }: AgentTaskProps) {
  return (
    <Task id={id} output={outputs.result} agent={agent} retries={2} timeoutMs={300000} {...taskProps}>
      {children}
    </Task>
  );
}
```

### Render Props

```tsx
function WithAnalysis({ children }: { children: (analysis: Analysis) => React.ReactNode }) {
  const ctx = useCtx();
  const analysis = ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" });
  if (!analysis) return null;
  return <>{children(analysis)}</>;
}

// Usage
<WithAnalysis>
  {(analysis) => (
    <Task id="fix" output={outputs.fix} agent={coder}>
      {`Fix: ${analysis.summary}`}
    </Task>
  )}
</WithAnalysis>
```

### Higher-Order Components

```tsx
function withRetry<P>(Component: React.FC<P>, retries: number) {
  return function RetryWrapper(props: P) {
    // The HOC adds default retry behavior
    return <Component {...props} retries={retries} retryPolicy={{ backoff: "exponential" }} />;
  };
}

const ResilientImplement = withRetry(Implement, 3);
```

### React Context for Configuration

```tsx
const AgentContext = React.createContext<{ reviewer: Agent; coder: Agent }>(null!);

function useAgents() {
  return React.useContext(AgentContext);
}

// Provide agents at the workflow level
<AgentContext.Provider value={{ reviewer: claude, coder: codex }}>
  <Workflow name="configurable">
    <ReviewCycle ticket={ticket} />
  </Workflow>
</AgentContext.Provider>

// Consume inside components
function Review({ ticket }: { ticket: Ticket }) {
  const { reviewer } = useAgents();
  return (
    <Task id={`${ticket.id}:review`} output={outputs.review} agent={reviewer}>
      Review the implementation.
    </Task>
  );
}
```

## What Works Differently

Now for the part that trips people up. Smithers is still React, so standard hooks can work. But workflow durability does **not** come from React state. Durable workflow state still lives in SQLite outputs and comes back through `ctx`.

| React feature | Works in Smithers? | Why |
| --- | --- | --- |
| Component composition | Yes | Tree construction |
| `useContext` / custom context | Yes | Available during render |
| `useCtx()` and custom hooks | Yes | Read from workflow context |
| Conditional rendering | Yes | Controls plan evolution |
| `React.memo` | No effect | Each frame is fresh |
| `useState` | Yes, but process-local only | Useful for local render behavior, not for durable state |
| `useEffect` | Yes, but process-local only | Good for local setup; not a replacement for durable tasks or tool calls |
| `useRef` | Yes, but process-local only | Persists across live re-renders, resets on restart/resume |
| `useMemo` | Yes | Great for stable providers, clients, and derived values |

This is intentional. Workflow truth lives in SQLite, not in React component state. The `ctx` object is still the single durable source of truth during each render, because it is rebuilt from persisted outputs. If you find yourself reaching for `useState` to remember something the workflow must survive, the answer is almost certainly `ctx.outputMaybe()`, `ctx.latest()`, or another output table.

## Why This Matters

So why go through all this? Because the reactive model gives Smithers three properties that static DAG definitions cannot provide.

### 1. Plans That Adapt

The workflow plan is not fixed upfront. It evolves as tasks complete:

```tsx
// Phase 1: Only discovery runs
<Task id="discover" output={outputs.tickets} agent={planner}>
  Break this project into tickets.
</Task>

// Phase 2: After discovery, tasks appear for each ticket
{tickets?.map((ticket) => (
  <ReviewCycle key={ticket.id} ticket={ticket} />
))}
```

The number and shape of tasks depend on what the planner discovers. A static DAG would need to know the ticket count upfront. You don't.
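
The same idea can be modeled in plain TypeScript, without any Smithers APIs. This is only a sketch: `renderPlan`, `Outputs`, and the task ID scheme are hypothetical names chosen for illustration, and the real engine derives the plan by rendering JSX rather than by calling a function like this.

```typescript
// Hypothetical model: a "render" is a pure function from persisted outputs to task IDs.
type Ticket = { id: string };
type Outputs = { tickets?: Ticket[] };

function renderPlan(outputs: Outputs): string[] {
  const plan = ["discover"];
  // Per-ticket tasks only exist once the discovery output exists.
  for (const ticket of outputs.tickets ?? []) {
    plan.push(`${ticket.id}:implement`, `${ticket.id}:validate`, `${ticket.id}:review`);
  }
  return plan;
}

// First render: nothing has completed, so only discovery is in the plan.
const before = renderPlan({});
// Later render: discovery produced two tickets, so six more tasks appear.
const after = renderPlan({ tickets: [{ id: "T-1" }, { id: "T-2" }] });
console.log(before.length, after.length); // prints "1 7"
```

Re-running the same function against newer outputs is what lets the plan grow without anyone declaring its final size.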

### 2. Natural Data Dependencies

Instead of declaring edges between nodes in a graph DSL, data dependencies are expressed as normal JavaScript:

```tsx
const analysis = ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" });

// This task can't mount until analysis exists — the dependency is implicit
{analysis ? <Task id="fix" ...>{`Fix: ${analysis.issues}`}</Task> : null}
```

The dependency graph emerges from the render — no explicit wiring needed. If you can write an `if` statement, you can express a dependency.

### 3. Reuse Through Composition

React's component model means workflow logic is reusable without special framework support:

```tsx
// A reusable review cycle — works for any ticket type
function ReviewCycle({ ticket, maxRounds = 3 }) { ... }

// A reusable approval gate — works for any action
function RequireApproval({ id, title, children }) { ... }

// Compose them freely
<Workflow name="release">
  {tickets.map((t) => <ReviewCycle key={t.id} ticket={t} />)}
  <RequireApproval id="ship" title="Ship to prod?">
    <Task id="deploy" agent={deployer}>Deploy.</Task>
  </RequireApproval>
</Workflow>
```

This is not a plugin system or a macro language. It's standard TypeScript components with standard composition rules. You already know how to do this.

## Next Steps

- [Workflows Overview](/concepts/workflows-overview) — The big picture of what workflows are and how they work.
- [Workflow State](/concepts/workflow-state) — The ctx API and how data flows between tasks.
- [Control Flow](/concepts/control-flow) — The four control-flow primitives that structure execution.

---

## Time Travel

> Snapshots, diffing, forking, and replay — navigating workflow history like a version control system for execution state.
> Source: https://smithers.sh/concepts/time-travel

When an AI workflow fails at step 47, you don't want to stare at logs and guess what went wrong. You want to rewind to step 46, see exactly what changed, and try again with different inputs. That's time travel.

Smithers captures a snapshot of the entire [workflow state](/concepts/workflow-state) at every frame commit. You can diff any two snapshots, fork a run from any point in its history, and replay from a checkpoint with modified inputs. Combined with jj (Jujutsu VCS) integration, you get versioned filesystem state alongside versioned workflow state.

## Snapshots

A snapshot is a frozen picture of everything that matters at a specific frame in a run:

- **Node states** — which tasks are pending, running, finished, or failed
- **Output rows** — the actual data each completed task produced
- **Loop (Ralph) state** — loop iteration counters and completion flags (`<Loop>` renders as a `<smithers:ralph>` host element internally)
- **Input data** — the original input row that started the run
- **VCS pointer** — the jj `change_id` at the time of capture (if the repo uses jj)
- **Workflow hash** — hash of the workflow definition at capture time (`null` if unavailable)
- **Content hash** — SHA-256 of the serialized state, so you can detect identical snapshots cheaply

Snapshots are captured automatically after every frame commit in the [engine loop](/concepts/execution-model). You never call a "checkpoint" function. The engine does it for you.

```
Frame 0 ──snapshot──> Frame 1 ──snapshot──> Frame 2 ──snapshot──> ...
```

Each snapshot is stored in `_smithers_snapshots` with a composite key of `(run_id, frame_no)`. The serialized JSON blobs are self-contained — you can reconstruct the full workflow state from any single snapshot without reading the event log.
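
To illustrate why a content hash makes identical snapshots cheap to detect, here is a minimal sketch using Node's `node:crypto`. The `SnapshotState` shape and the sorted-key serialization are assumptions made for this example; the document only states that the hash is a SHA-256 of the serialized state, not how Smithers serializes it.

```typescript
import { createHash } from "node:crypto";

// Hypothetical, simplified snapshot shape for illustration only.
interface SnapshotState {
  nodes: Record<string, string>; // nodeId -> state
  outputs: Record<string, unknown>; // nodeId -> output row
  input: unknown;
}

// Serialize plain JSON data with sorted object keys so equal states
// always produce the same string (an assumption of this sketch).
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalJson).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const entries = Object.keys(record)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalJson(record[key])}`);
    return `{${entries.join(",")}}`;
  }
  // Treat undefined as null so the result is always valid JSON.
  return JSON.stringify(value ?? null);
}

function contentHash(state: SnapshotState): string {
  return createHash("sha256").update(canonicalJson(state)).digest("hex");
}

const a: SnapshotState = {
  nodes: { analyze: "finished" },
  outputs: { analyze: { ok: true } },
  input: { prompt: "hi" },
};
// Same data, different key order.
const b: SnapshotState = {
  input: { prompt: "hi" },
  outputs: { analyze: { ok: true } },
  nodes: { analyze: "finished" },
};
// One node state differs.
const c: SnapshotState = { ...a, nodes: { analyze: "failed" } };

console.log(contentHash(a) === contentHash(b), contentHash(a) === contentHash(c));
```

Comparing two 64-character hex strings is enough to tell whether two frames hold the same state, without loading either blob in full.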

### Loading the Latest Snapshot

To load the most recent snapshot for a run (regardless of frame number), use `loadLatestSnapshot`:

```typescript
import { loadLatestSnapshot } from "smithers-orchestrator";

const snapshot = await loadLatestSnapshot(adapter, runId);
// Returns the snapshot with the highest frame_no, or undefined if none exist.
```

This is what the engine uses internally when [resuming a suspended run](/concepts/suspend-and-resume) — it picks up from the last committed snapshot instead of replaying the event log.

### Why Not Just Use Events?

Events tell you *what happened*. Snapshots tell you *what the world looked like*. Replaying an event log to reconstruct state is expensive and error-prone. A snapshot gives you the answer in one read.

## Diffing

Given two snapshots, `diffSnapshots` computes a structured diff:

```typescript
const diff = diffSnapshots(snapshotA, snapshotB);
// diff.nodesAdded    — nodes present in B but not A
// diff.nodesRemoved  — nodes present in A but not B
// diff.nodesChanged  — nodes whose state or output changed
// diff.outputsAdded  — new output rows in B
// diff.outputsRemoved — output rows gone in B
// diff.outputsChanged — output rows with different values
// diff.ralphChanged   — loop state differences
// diff.inputChanged   — whether the input row changed
// diff.vcsPointerChanged — whether the VCS pointer changed
```

This is a pure function — no database access needed. You pass in two snapshot objects and get back a structured diff you can render in the TUI, dump as JSON, or use programmatically.
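
As a rough sketch of what a pure snapshot diff involves, the example below compares only node states. `diffNodeStates` and its types are hypothetical and much smaller than the real `diffSnapshots` result; they just show the added / removed / changed classification.

```typescript
// Hypothetical, simplified snapshot: only node states are modeled here.
type NodeStates = Record<string, string>;

interface NodeDiff {
  nodesAdded: string[];
  nodesRemoved: string[];
  nodesChanged: string[];
}

const has = (obj: NodeStates, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Pure function: reads two plain objects, touches no database.
function diffNodeStates(a: NodeStates, b: NodeStates): NodeDiff {
  const nodesAdded = Object.keys(b).filter((id) => !has(a, id)).sort();
  const nodesRemoved = Object.keys(a).filter((id) => !has(b, id)).sort();
  const nodesChanged = Object.keys(a)
    .filter((id) => has(b, id) && a[id] !== b[id])
    .sort();
  return { nodesAdded, nodesRemoved, nodesChanged };
}

const frame3: NodeStates = { analyze: "finished", fix: "running" };
const frame7: NodeStates = { analyze: "finished", fix: "failed", review: "pending" };
const nodeDiff = diffNodeStates(frame3, frame7);
console.log(JSON.stringify(nodeDiff));
```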

The CLI exposes this as `smithers diff`:

```bash
# Diff two frames in the same run
smithers diff myworkflow.tsx abc123:3 abc123:7

# Diff the latest frames of two different runs
smithers diff myworkflow.tsx abc123 def456
```

## Forking

Forking creates a new run that starts from the state of an existing run at a specific frame. Think `git branch` but for workflow execution.

```
Run A:  Frame 0 → Frame 1 → Frame 2 → Frame 3 (failed)
                              ↓
Run B:                    Frame 0 → Frame 1 → ... (forked from A at frame 2)
```

When you fork:

1. A new run is created with a fresh `runId`
2. The snapshot from the parent run at the specified frame is copied as the initial state
3. Optionally, specific nodes are reset to "pending" — they and their downstream dependents will re-execute
4. Optionally, the input is overridden with new values
5. Optionally, a `forkDescription` is attached for traceability
6. The parent-child relationship is recorded in `_smithers_branches`

Forking is cheap because it copies a single snapshot row, not the entire event history.
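
The steps above can be sketched as a single function over in-memory data. Everything here is hypothetical (`forkSnapshot`, `Snapshot`, `BranchRecord`, `ForkOptions`); it mirrors the listed steps but says nothing about how Smithers writes to `_smithers_snapshots` or `_smithers_branches`, and it leaves out the transitive reset of dependents.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical shapes for illustration only.
interface Snapshot {
  runId: string;
  frameNo: number;
  nodes: Record<string, string>;
  outputs: Record<string, unknown>;
  input: unknown;
}
interface BranchRecord {
  runId: string;
  parentRunId: string;
  parentFrameNo: number;
  forkDescription?: string;
}
interface ForkOptions {
  resetNodes?: string[];
  input?: unknown;
  forkDescription?: string;
}

function forkSnapshot(
  source: Snapshot,
  options: ForkOptions = {},
): { snapshot: Snapshot; branch: BranchRecord } {
  const runId = randomUUID(); // step 1: fresh run ID
  const nodes = { ...source.nodes }; // step 2: copy the parent state
  const outputs = { ...source.outputs };
  for (const id of options.resetNodes ?? []) {
    nodes[id] = "pending"; // step 3: reset selected nodes
    delete outputs[id];
  }
  const snapshot: Snapshot = {
    runId,
    frameNo: 0,
    nodes,
    outputs,
    input: options.input ?? source.input, // step 4: optional input override
  };
  const branch: BranchRecord = {
    runId,
    parentRunId: source.runId,
    parentFrameNo: source.frameNo,
    // step 5: optional description for traceability
    ...(options.forkDescription !== undefined
      ? { forkDescription: options.forkDescription }
      : {}),
  };
  return { snapshot, branch }; // step 6: caller records the branch
}

const runA: Snapshot = {
  runId: "abc123",
  frameNo: 2,
  nodes: { analyze: "finished", implement: "finished" },
  outputs: { analyze: { issues: 3 }, implement: { files: 2 } },
  input: { prompt: "Fix the bug" },
};
const forked = forkSnapshot(runA, { resetNodes: ["implement"], forkDescription: "retry implement" });
console.log(forked.snapshot.nodes.implement, forked.branch.parentFrameNo);
```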

### Reset Nodes

The `resetNodes` parameter lets you selectively re-execute specific tasks. When a node is reset:

- Its state is set to `pending`
- Its output row is cleared
- Any downstream nodes that depend on it are also reset (transitively)

This is the key mechanism for "what-if" experiments. Change one task's behavior and see how it ripples through the rest of the workflow.
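
A transitive reset is a graph traversal. The sketch below assumes a hypothetical `dependents` map from each node to the nodes that consume its output; how Smithers actually derives that relationship is not specified here.

```typescript
// Hypothetical model: `dependents` maps a node ID to the nodes that depend on it.
type Dependents = Record<string, string[]>;

// Breadth-first walk that collects the reset nodes plus everything downstream.
function collectResetSet(resetNodes: string[], dependents: Dependents): Set<string> {
  const toReset = new Set<string>();
  const queue = [...resetNodes];
  while (queue.length > 0) {
    const id = queue.shift() as string;
    if (toReset.has(id)) continue; // already visited
    toReset.add(id);
    queue.push(...(dependents[id] ?? []));
  }
  return toReset;
}

const graph: Dependents = {
  analyze: ["implement"],
  implement: ["validate", "review"],
  validate: [],
  review: [],
  docs: [],
};
const resetSet = collectResetSet(["implement"], graph);
console.log(Array.from(resetSet).sort().join(", ")); // prints "implement, review, validate"
```

Nodes upstream of the reset point (`analyze`) and unrelated nodes (`docs`) keep their outputs, which is what makes a targeted what-if run cheaper than starting over.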

### Listing Branches

To list all forks that branched from a given run:

```typescript
import { listBranches } from "smithers-orchestrator";

const branches = await listBranches(adapter, parentRunId);
// Returns BranchInfo[] — one entry per child run forked from this parent.
```

Each `BranchInfo` contains the child `runId`, `parentRunId`, `parentFrameNo`, optional `branchLabel` and `forkDescription`, and a `createdAtMs` timestamp.

### Looking Up Branch Info

To check whether a run is itself a fork and retrieve its parent relationship:

```typescript
import { getBranchInfo } from "smithers-orchestrator";

const info = await getBranchInfo(adapter, childRunId);
// Returns BranchInfo | undefined — undefined if the run is not a fork.
```

## Replay

Replay combines forking with execution. It creates a forked run and immediately starts running it:

```bash
# Replay from frame 5 of a specific run
smithers replay workflow.tsx --run-id abc123 --frame 5

# Replay with a specific node reset
smithers replay workflow.tsx --run-id abc123 --frame 5 --node implement

# Replay with new input
smithers replay workflow.tsx --run-id abc123 --frame 5 --input '{"prompt":"Try a different approach"}'

# Replay with VCS state restored
smithers replay workflow.tsx --run-id abc123 --frame 5 --restore-vcs
```

With `--restore-vcs`, Smithers also restores the filesystem to the jj revision that was active at the source frame. This means the code that runs the workflow is the same code that was running when the snapshot was taken.

## VCS Integration

Every snapshot records the jj `change_id` and operation ID at capture time. This creates a parallel timeline: workflow state in [SQLite](https://sqlite.org/wal.html), filesystem state in jj.

The `_smithers_vcs_tags` table maps `(run_id, frame_no)` to VCS metadata. To look up a specific tag:

```typescript
import { loadVcsTag } from "smithers-orchestrator";

const tag = await loadVcsTag(adapter, runId, frameNo);
// Returns VcsTag | undefined — includes vcsType, vcsPointer, vcsRoot, jjOperationId.
```

When you replay with `--restore-vcs`, Smithers:

1. Looks up the VCS pointer for the source frame
2. Creates a jj workspace at that revision
3. Executes the workflow from the workspace

This means you can replay a workflow exactly as it ran — same code, same state, same inputs — even if the codebase has changed since then.

## Timeline Visualization

The timeline shows the complete execution history of a run and all its forks:

```bash
# View timeline for a single run
smithers timeline abc123

# View the full branch tree
smithers timeline abc123 --tree

# JSON output for programmatic use
smithers timeline abc123 --json
```

The `--tree` flag recursively includes all child runs (forks), building a tree of execution history. The TUI's run detail view shows this automatically.
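
Conceptually, the tree is built by following parent-child fork records recursively. The sketch below uses a reduced, hypothetical `BranchInfo` (only three of the fields described earlier) and a made-up `TimelineNode` type; it is not the CLI's actual output format.

```typescript
// Reduced branch record: only the fields this sketch needs.
interface BranchInfo {
  runId: string;
  parentRunId: string;
  parentFrameNo: number;
}

// Hypothetical tree node for display purposes.
interface TimelineNode {
  runId: string;
  forkedAtFrame: number | null; // null for the root run
  children: TimelineNode[];
}

// Assumes fork records are acyclic, since every fork creates a new run.
function buildTree(
  runId: string,
  branches: BranchInfo[],
  forkedAtFrame: number | null = null,
): TimelineNode {
  const children = branches
    .filter((branch) => branch.parentRunId === runId)
    .map((branch) => buildTree(branch.runId, branches, branch.parentFrameNo));
  return { runId, forkedAtFrame, children };
}

const branches: BranchInfo[] = [
  { runId: "def456", parentRunId: "abc123", parentFrameNo: 2 },
  { runId: "ghi789", parentRunId: "def456", parentFrameNo: 1 },
  { runId: "jkl012", parentRunId: "abc123", parentFrameNo: 5 },
];
const tree = buildTree("abc123", branches);
console.log(JSON.stringify(tree, null, 2));
```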

## Database Tables

Time travel adds three tables:

| Table | Primary Key | Purpose |
|-------|-------------|---------|
| `_smithers_snapshots` | `(run_id, frame_no)` | Full state capture at each frame |
| `_smithers_branches` | `run_id` | Parent-child fork relationships |
| `_smithers_vcs_tags` | `(run_id, frame_no)` | jj revision metadata per snapshot |

The `_smithers_runs` table also gains three columns: `parent_run_id`, `parent_frame_no`, and `branch_label`, making fork relationships queryable directly from the runs table.

## Selecting a Specific Attempt

When you time-travel to a node, by default Smithers picks the most recent attempt. Pass an explicit `attempt` number to target a different one:

```bash
# Travel back to attempt 2 of the "implement" node
smithers travel workflow.tsx --run-id abc123 --node implement --attempt 2
```

In code, this is the `attempt` field on `TimeTravelOptions`. If the specified attempt does not exist, the operation fails with `success: false` and no changes are made.
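
The selection rule can be sketched as follows. `selectAttempt`, `Attempt`, and `AttemptSelection` are hypothetical names, and "most recent" is interpreted here as the highest attempt number; only the `success: false` outcome for a missing attempt comes from the description above.

```typescript
// Hypothetical attempt record; only the fields this sketch needs.
interface Attempt {
  attempt: number;
  startedAtMs: number;
}

type AttemptSelection =
  | { success: true; attempt: Attempt }
  | { success: false; error: string };

function selectAttempt(attempts: Attempt[], requested?: number): AttemptSelection {
  if (attempts.length === 0) {
    return { success: false, error: "node has no attempts" };
  }
  if (requested === undefined) {
    // Default: the most recent attempt, taken here as the highest attempt number.
    const latest = attempts.reduce((best, a) => (a.attempt > best.attempt ? a : best));
    return { success: true, attempt: latest };
  }
  const match = attempts.find((a) => a.attempt === requested);
  return match
    ? { success: true, attempt: match }
    : { success: false, error: `attempt ${requested} does not exist` };
}

const attempts: Attempt[] = [
  { attempt: 1, startedAtMs: 1000 },
  { attempt: 2, startedAtMs: 2000 },
  { attempt: 3, startedAtMs: 3000 },
];
const latestPick = selectAttempt(attempts);
const explicitPick = selectAttempt(attempts, 2);
const missingPick = selectAttempt(attempts, 9);
console.log(latestPick.success, explicitPick.success, missingPick.success); // prints "true true false"
```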

## Run Reset

The `smithers reset` command resets an entire run back to its starting state without creating a fork. Unlike `smithers travel` (which targets a specific node), a run reset re-queues every node and clears all outputs:

```bash
# Reset a stalled or failed run to re-execute from scratch
smithers reset workflow.tsx --run-id abc123
```

This is a destructive operation on the run itself: no child run is created. Use `smithers replay` when you want to preserve the original run history alongside the re-execution.

## Metrics

Time travel operations export four metrics:

| Metric | Type | Description |
|--------|------|-------------|
| `smithers.snapshots.captured` | counter | Total snapshots written to the database |
| `smithers.snapshot.duration_ms` | histogram | Time to serialize and write a single snapshot |
| `smithers.forks.created` | counter | Total fork operations completed |
| `smithers.replays.started` | counter | Total replay operations initiated |

These appear in Prometheus exports and OpenTelemetry traces alongside all other Smithers metrics.

## Reset Dependents Toggle

By default, when you time-travel to a specific node, every downstream node that ran after the target attempt is also reset. You can disable cascade reset with `--no-deps`:

```bash
# Reset only the "analyze" node, leave its dependents as-is
smithers travel workflow.tsx --run-id abc123 --node analyze --no-deps
```

In code, set `resetDependents: false` on `TimeTravelOptions`. This is useful when you want to re-run a single task without disturbing work that is downstream but was not actually affected by the target node's output.

When `resetDependents` is `true` (the default), Smithers identifies all nodes whose attempts started at or after the target attempt's start timestamp and resets them too.
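
The timestamp rule can be sketched like this. `AttemptRow` and `nodesToReset` are hypothetical; the only behavior taken from the text is "attempts that started at or after the target attempt's start timestamp are reset" and the `resetDependents` toggle.

```typescript
// Hypothetical attempt row; field names are assumptions for this sketch.
interface AttemptRow {
  nodeId: string;
  attempt: number;
  startedAtMs: number;
}

function nodesToReset(
  rows: AttemptRow[],
  targetNodeId: string,
  targetAttempt: number,
  resetDependents = true,
): string[] {
  const target = rows.find(
    (row) => row.nodeId === targetNodeId && row.attempt === targetAttempt,
  );
  if (!target) return []; // unknown attempt: nothing to reset
  if (!resetDependents) return [targetNodeId];

  const ids = new Set<string>([targetNodeId]);
  for (const row of rows) {
    // Anything that started at or after the target attempt is reset too.
    if (row.startedAtMs >= target.startedAtMs) ids.add(row.nodeId);
  }
  return Array.from(ids).sort();
}

const attemptRows: AttemptRow[] = [
  { nodeId: "discover", attempt: 1, startedAtMs: 100 },
  { nodeId: "analyze", attempt: 1, startedAtMs: 200 },
  { nodeId: "implement", attempt: 1, startedAtMs: 300 },
  { nodeId: "review", attempt: 1, startedAtMs: 400 },
];
console.log(nodesToReset(attemptRows, "analyze", 1).join(", ")); // prints "analyze, implement, review"
console.log(nodesToReset(attemptRows, "analyze", 1, false).join(", ")); // prints "analyze"
```

Note that a purely time-based rule also resets nodes that merely ran later, even if they never read the target's output, which is the case `--no-deps` exists for.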

## Frame History Truncation on Revert

When a time-travel operation completes, Smithers truncates the frame log to match. All frames with a `created_at_ms` after the target attempt's start timestamp are deleted from `_smithers_frames`. This keeps the frame history consistent with the reset node states: if you render the workflow after time travel, the frame log reflects the point in time you reverted to.

## Snapshot Restoration on Resume

When a suspended run resumes (for example, after a [`<WaitForEvent>`](/components/wait-for-event) unblocks), the engine calls `restoreDurableStateFromSnapshot` before re-entering the render loop. It loads the most recent snapshot for the run, re-inserts the input row, and rebuilds node state from the snapshot data. This means a resumed run does not need to replay the entire event log; it picks up from the last committed snapshot.

## Events

Three new event types track time travel operations:

- **`SnapshotCaptured`** — emitted after each automatic snapshot. Carries `runId`, `frameNo`, and `contentHash`.
- **`RunForked`** — emitted when a fork is created. Carries the parent run ID, parent frame, and optional branch label.
- **`ReplayStarted`** — emitted when a replay begins. Carries the source run ID and frame.

## Next Steps

- [Time Travel Quickstart](/guides/time-travel-quickstart) -- Walk through diffing, replaying, and restoring runs from the CLI.
- [Workflow State](/concepts/workflow-state) -- The state model snapshots serialize and restore.
- [Suspend and Resume](/concepts/suspend-and-resume) -- How resumed runs restore from the latest snapshot.
- [Runtime Revert](/runtime/revert) -- Runtime APIs behind revert and replay operations.
- [Debugging Guide](/guides/debugging) -- Use snapshot diffs and forks to investigate failures.

---

## Scorers

> How Smithers evaluates task outputs using built-in and custom scorers with live and batch scoring modes.
> Source: https://smithers.sh/concepts/evals

Smithers ships a scoring system that lets you attach evaluation functions to tasks. Scorers run after a task finishes and produce a numeric score between 0 and 1, an optional human-readable reason, and optional metadata.

Scores are persisted in SQLite alongside your run data so you can query, aggregate, and visualize quality over time.

## Core Concepts

### Scorer

A scorer is a named function that takes a `ScorerInput` and returns a `ScoreResult`:

```ts
import { createScorer } from "smithers-orchestrator/scorers";

const myScorer = createScorer({
  id: "length-check",
  name: "Length Check",
  description: "Checks output meets minimum length",
  score: async ({ output }) => {
    const text = String(output);
    const score = Math.min(text.length / 500, 1);
    return { score, reason: `Output is ${text.length} chars` };
  },
});
```

### ScoreResult

Every scorer returns a `ScoreResult`:

| Field    | Type                        | Description                              |
|----------|-----------------------------|------------------------------------------|
| `score`  | `number` (0-1)              | Normalized quality score                 |
| `reason` | `string?`                   | Human-readable explanation               |
| `meta`   | `Record<string, unknown>?`  | Arbitrary metadata for downstream use    |

### ScorerInput

The input passed to every scorer function:

| Field          | Type              | Description                                |
|----------------|-------------------|--------------------------------------------|
| `input`        | `unknown`         | The original task input/prompt             |
| `output`       | `unknown`         | The task's produced output                 |
| `groundTruth`  | `unknown?`        | Expected output for comparison             |
| `context`      | `unknown?`        | Additional context (e.g. retrieved docs)   |
| `latencyMs`    | `number?`         | How long the task took in milliseconds     |
| `outputSchema` | `ZodObject?`      | The Zod schema the output should match     |

## Attaching Scorers to Tasks

Pass a `scorers` map to any `<Task>` component:

```tsx
import { latencyScorer, schemaAdherenceScorer } from "smithers-orchestrator/scorers";

<Task
  id="analyze"
  agent={claude}
  output={outputs.analysis}
  scorers={{
    latency: { scorer: latencyScorer({ targetMs: 5000, maxMs: 30000 }) },
    schema: { scorer: schemaAdherenceScorer() },
  }}
>
  Analyze the codebase and produce a summary.
</Task>
```

Scorers fire asynchronously after the task finishes. They never block the workflow.

## Sampling

Not every run needs every scorer. Use sampling to control evaluation frequency:

```tsx
scorers={{
  relevancy: {
    scorer: relevancyScorer(judge),
    sampling: { type: "ratio", rate: 0.1 },  // 10% of runs
  },
  schema: {
    scorer: schemaAdherenceScorer(),
    sampling: { type: "all" },  // every run (default)
  },
}}
```

| Sampling Type | Behavior                        |
|---------------|---------------------------------|
| `all`         | Run on every task execution     |
| `ratio`       | Run with probability `rate`     |
| `none`        | Disabled (useful for toggling)  |
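
The sampling decision itself is small. A minimal sketch, assuming a hypothetical `shouldScore` helper and an injectable random source for testing (neither is part of the Smithers API):

```typescript
type Sampling =
  | { type: "all" }
  | { type: "ratio"; rate: number }
  | { type: "none" };

// `random` is injectable so the decision can be tested deterministically.
function shouldScore(
  sampling: Sampling = { type: "all" }, // "all" is the documented default
  random: () => number = Math.random,
): boolean {
  if (sampling.type === "all") return true;
  if (sampling.type === "none") return false;
  // "ratio": run with probability `rate` (random() is in [0, 1)).
  return random() < sampling.rate;
}

console.log(shouldScore({ type: "ratio", rate: 0.1 }, () => 0.05)); // prints "true"
console.log(shouldScore({ type: "ratio", rate: 0.1 }, () => 0.5)); // prints "false"
```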

## Custom Scorers

### createScorer

Build a scorer from a plain configuration object:

```ts
import { createScorer } from "smithers-orchestrator/scorers";

const myScorer = createScorer({
  id: "word-count",
  name: "Word Count",
  description: "Scores based on output word count",
  score: async ({ output }) => {
    // Trim and drop empty tokens so blank output counts as 0 words, not 1.
    const words = String(output).trim().split(/\s+/).filter(Boolean).length;
    return { score: Math.min(words / 200, 1), reason: `${words} words` };
  },
});
```

### llmJudge

Build an LLM-as-judge scorer that delegates evaluation to an AI agent. The judge receives a prompt constructed from `promptTemplate` and is expected to return JSON with `score` (0–1) and optional `reason`. If the response cannot be parsed, the scorer returns 0 with a diagnostic reason.

```ts
import { llmJudge } from "smithers-orchestrator/scorers";

const toneScorer = llmJudge({
  id: "professional-tone",
  name: "Professional Tone",
  description: "Evaluates if the output maintains a professional tone",
  judge,
  instructions: "You evaluate whether text maintains a professional, business-appropriate tone.",
  promptTemplate: ({ input, output }) =>
    `Rate the professionalism of this response (0-1 JSON).\n\nInput: ${String(input)}\n\nOutput: ${String(output)}`,
});
```

| Field             | Type                            | Description                                                    |
|-------------------|---------------------------------|----------------------------------------------------------------|
| `id`              | `string`                        | Unique scorer identifier                                       |
| `name`            | `string`                        | Human-readable name                                            |
| `description`     | `string`                        | What this scorer evaluates                                     |
| `judge`           | `AgentLike`                     | The agent that performs the evaluation                          |
| `instructions`    | `string`                        | System-level instructions prepended to the prompt              |
| `promptTemplate`  | `(input: ScorerInput) => string`| Builds the prompt from the scorer input                        |
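
The "return 0 with a diagnostic reason" fallback could look roughly like the sketch below. `parseJudgeResponse` is a hypothetical helper, and both the "first `{...}` span" extraction and the clamping to the 0 to 1 range are assumptions of this example rather than documented `llmJudge` behavior.

```typescript
interface ScoreResult {
  score: number;
  reason?: string;
}

function parseJudgeResponse(text: string): ScoreResult {
  // Judges often wrap JSON in prose or code fences, so take the outermost {...} span.
  const match = text.match(/\{[\s\S]*\}/);
  if (!match) {
    return { score: 0, reason: "Judge response contained no JSON object" };
  }
  try {
    const record = JSON.parse(match[0]) as { score?: unknown; reason?: unknown };
    if (typeof record.score !== "number") {
      return { score: 0, reason: "Judge response is missing a numeric score" };
    }
    // Clamp into the documented 0-1 range (assumption of this sketch).
    const score = Math.min(1, Math.max(0, record.score));
    return typeof record.reason === "string" ? { score, reason: record.reason } : { score };
  } catch {
    return { score: 0, reason: "Judge response was not valid JSON" };
  }
}

console.log(parseJudgeResponse('Sure! {"score": 0.8, "reason": "polite"}').score); // prints "0.8"
console.log(parseJudgeResponse("I cannot rate this.").score); // prints "0"
```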

## Built-in Scorers

Smithers includes five built-in scorers:

### Code-based (no LLM needed)

**`schemaAdherenceScorer()`** — Validates that the output conforms to the task's Zod `outputSchema`. Returns 1.0 if `safeParse` succeeds, 0.0 if it fails (with validation issues in the reason). If no `outputSchema` is set, returns 1.0 with a skip note.

**`latencyScorer({ targetMs, maxMs })`** — Scores based on task execution time. Returns 1.0 at or below `targetMs`, linearly interpolates to 0.0 at `maxMs`, and returns 0.0 above `maxMs`. If no latency data is available, returns 1.0 with a skip note.
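
The interpolation described for `latencyScorer` works out as follows. `latencyScore` is a standalone, hypothetical function written for this example; it reproduces the stated rule but is not the library's implementation.

```typescript
// Standalone sketch of the rule: 1.0 at or below targetMs, 0.0 at or above maxMs,
// and a straight line in between.
function latencyScore(
  latencyMs: number | undefined,
  { targetMs, maxMs }: { targetMs: number; maxMs: number },
): number {
  if (latencyMs === undefined) return 1; // no latency data: treated as a skip
  if (latencyMs <= targetMs) return 1;
  if (latencyMs >= maxMs) return 0;
  return (maxMs - latencyMs) / (maxMs - targetMs);
}

const limits = { targetMs: 5000, maxMs: 30000 };
console.log(latencyScore(4000, limits)); // prints "1"
console.log(latencyScore(17500, limits)); // prints "0.5"
console.log(latencyScore(45000, limits)); // prints "0"
```

With `targetMs: 5000` and `maxMs: 30000`, a task that takes 17.5 seconds sits exactly halfway between the two limits and scores 0.5.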

### LLM-based (requires a judge agent)

All three LLM-based scorers accept an `AgentLike` as `judge`. They construct a prompt with evaluation criteria, call `judge.generate()`, and parse the JSON response.

**`relevancyScorer(judge)`** — Evaluates whether the output is relevant to and addresses the input prompt. Considers both direct answers and related context. Scores from 0.0 (completely irrelevant) to 1.0 (perfectly relevant).

**`toxicityScorer(judge)`** — Detects toxic, harmful, offensive, or inappropriate content. Checks for hate speech, harassment, threats, discriminatory language, explicit content, and dangerous instructions. The score represents the *level* of toxicity: 0.0 means clean, 1.0 means highly toxic.

**`faithfulnessScorer(judge)`** — Checks whether the output is faithful to the provided `context` without hallucinations. Every claim in the output should be supported by the context. Scores from 0.0 (entirely fabricated) to 1.0 (completely faithful). If no context is provided, evaluates internal consistency.

## Persistence

All scores are stored in the `_smithers_scorers` table:

| Column         | Type    | Description                           |
|----------------|---------|---------------------------------------|
| `id`           | TEXT    | Unique score row ID                   |
| `run_id`       | TEXT    | Parent run                            |
| `node_id`      | TEXT    | Task that was scored                  |
| `iteration`    | INTEGER | Task iteration                        |
| `attempt`      | INTEGER | Task attempt number                   |
| `scorer_id`    | TEXT    | Scorer identifier                     |
| `scorer_name`  | TEXT    | Human-readable scorer name            |
| `source`       | TEXT    | `live` or `batch`                     |
| `score`        | REAL    | The 0-1 score                         |
| `reason`       | TEXT    | Optional explanation                  |
| `meta_json`    | TEXT    | JSON metadata                         |
| `input_json`   | TEXT    | Serialized scorer input               |
| `output_json`  | TEXT    | Serialized task output                |
| `latency_ms`   | REAL    | Task execution latency                |
| `scored_at_ms` | INTEGER | When the score was computed           |
| `duration_ms`  | REAL    | How long the scorer itself took       |

## Execution Modes

### Async (live scoring)

When scorers are attached to a `<Task>`, they run via `runScorersAsync` — fire-and-forget execution that never blocks the workflow. All scorers run concurrently with unbounded concurrency. Errors are logged but do not fail the task.

### Batch (offline evaluation)

For testing and offline evaluation, call `runScorersBatch` directly. It runs all scorers, waits for completion, and returns a map of key to `ScoreResult | null`:

```ts
import { runScorersBatch, schemaAdherenceScorer } from "smithers-orchestrator/scorers";

const results = await runScorersBatch(
  { schema: { scorer: schemaAdherenceScorer() } },
  { runId: "test", nodeId: "analyze", iteration: 0, attempt: 0, input: "...", output: { summary: "..." } },
  adapter,
);
// e.g. { schema: { score: 1, reason: "..." } }
// No outputSchema is passed here, so schemaAdherenceScorer returns 1 with a skip note.
```

Both modes persist results to the `_smithers_scorers` table with a `source` column of `"live"` or `"batch"`.
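
The "run everything concurrently, never let one failure sink the rest" behavior can be sketched with plain promises. `runAllScorers` and `ScoreFn` are hypothetical, persistence is omitted, and using `null` to mark a scorer that threw is an assumption about what the `ScoreResult | null` map means.

```typescript
interface ScoreResult {
  score: number;
  reason?: string;
}
type ScoreFn = (input: { output: unknown }) => Promise<ScoreResult>;

// Runs every scorer concurrently; a failing scorer is logged and mapped to null.
async function runAllScorers(
  scorers: Record<string, ScoreFn>,
  input: { output: unknown },
): Promise<Record<string, ScoreResult | null>> {
  const pairs = await Promise.all(
    Object.entries(scorers).map(
      async ([key, score]): Promise<[string, ScoreResult | null]> => {
        try {
          return [key, await score(input)];
        } catch (error) {
          console.error(`scorer "${key}" failed`, error);
          return [key, null];
        }
      },
    ),
  );
  return Object.fromEntries(pairs);
}

const demoScorers: Record<string, ScoreFn> = {
  nonEmpty: async ({ output }) => ({ score: String(output).length > 0 ? 1 : 0 }),
  broken: async () => {
    throw new Error("judge unavailable");
  },
};

// Batch style: await the results. Live style would call this without awaiting.
runAllScorers(demoScorers, { output: "hello" }).then((results) => {
  console.log(JSON.stringify(results));
});
```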

## Aggregation

Query aggregate statistics across runs:

```ts
import { aggregateScores } from "smithers-orchestrator/scorers";

const stats = await aggregateScores(adapter, { runId: "run-123" });
```

### Filter Options

| Filter     | Type     | Description                         |
|------------|----------|-------------------------------------|
| `runId`    | `string` | Filter to a specific run            |
| `nodeId`   | `string` | Filter to a specific task node      |
| `scorerId` | `string` | Filter to a specific scorer         |

All filters are optional and can be combined.

### Returned Statistics

Each entry in the returned array contains:

| Field        | Type     | Description                                      |
|--------------|----------|--------------------------------------------------|
| `scorerId`   | `string` | Scorer identifier                                |
| `scorerName` | `string` | Human-readable scorer name                       |
| `count`      | `number` | Total number of scores                           |
| `mean`       | `number` | Average score                                    |
| `min`        | `number` | Lowest score                                     |
| `max`        | `number` | Highest score                                    |
| `p50`        | `number` | Median score (50th percentile)                   |
| `stddev`     | `number` | Standard deviation (population)                  |
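
For reference, these statistics can be computed from a list of scores as shown below. `summarize` is a hypothetical helper for one scorer's scores; the median here averages the two middle values for even counts, which is an assumption since the exact percentile method is not stated.

```typescript
interface ScoreStats {
  count: number;
  mean: number;
  min: number;
  max: number;
  p50: number;
  stddev: number;
}

function summarize(scores: number[]): ScoreStats | null {
  if (scores.length === 0) return null;
  const sorted = [...scores].sort((a, b) => a - b);
  const count = sorted.length;
  const mean = sorted.reduce((sum, s) => sum + s, 0) / count;
  const mid = Math.floor(count / 2);
  // Median: middle value, or the average of the two middle values.
  const p50 = count % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  // Population standard deviation: divide by count, not count - 1.
  const variance = sorted.reduce((sum, s) => sum + (s - mean) ** 2, 0) / count;
  return { count, mean, min: sorted[0], max: sorted[count - 1], p50, stddev: Math.sqrt(variance) };
}

const scoreStats = summarize([0.25, 0.75, 0.25, 0.75]);
console.log(JSON.stringify(scoreStats));
// prints {"count":4,"mean":0.5,"min":0.25,"max":0.75,"p50":0.5,"stddev":0.25}
```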

## Events

Three event types are emitted during the scorer lifecycle:

**`ScorerStarted`** — Emitted when a scorer begins evaluation.

| Field        | Type     |
|--------------|----------|
| `scorerId`   | `string` |
| `scorerName` | `string` |
| `nodeId`     | `string` |
| `runId`      | `string` |

**`ScorerFinished`** — Emitted when a scorer completes successfully. Includes the `score` value.

| Field        | Type     |
|--------------|----------|
| `scorerId`   | `string` |
| `scorerName` | `string` |
| `score`      | `number` |
| `nodeId`     | `string` |
| `runId`      | `string` |

**`ScorerFailed`** — Emitted when a scorer throws an error. Includes the `error`.

| Field        | Type      |
|--------------|-----------|
| `scorerId`   | `string`  |
| `scorerName` | `string`  |
| `error`      | `unknown` |
| `nodeId`     | `string`  |
| `runId`      | `string`  |

## Metrics

Smithers tracks four Effect metrics for scorer execution:

| Metric                          | Type      | Description                                  |
|---------------------------------|-----------|----------------------------------------------|
| `smithers.scorers.started`      | Counter   | Incremented when a scorer begins             |
| `smithers.scorers.finished`     | Counter   | Incremented when a scorer completes          |
| `smithers.scorers.failed`       | Counter   | Incremented when a scorer throws             |
| `smithers.scorer.duration_ms`   | Histogram | Scorer execution time (exponential buckets, ~10ms to ~80s) |

These metrics are available through the standard Effect metric system and can be exported via OTLP. See [Monitoring and Logs](/guides/monitoring-logs).

## CLI

View scores from the command line:

```bash
# Show all scores for a run
smithers scores <run_id>

# Show scores for a specific node
smithers scores <run_id> --node analyze
```

---

## Cross-Run Memory

> How Smithers persists facts, messages, and semantic recall across workflow runs.
> Source: https://smithers.sh/concepts/memory

Your workflow completes a task, produces useful output, and exits. Next time it runs, that knowledge is gone. The agent starts from scratch every time. Cross-run memory fixes this by giving workflows a persistent brain -- facts they can write and read back, messages they can recall, and semantic search over past outputs.

## Three Kinds of Memory

Smithers memory has three layers, each solving a different problem:

| Layer | What it stores | How you access it | When to use it |
|-------|---------------|-------------------|----------------|
| **Working Memory** | Key-value facts with optional TTL | `getFact` / `setFact` | Config, counters, last-known-good values |
| **Message History** | Ordered messages per thread | `saveMessage` / `listMessages` | Conversation logs, audit trails |
| **Semantic Recall** | Embedded text searchable by similarity | `remember` / `recall` | "Find past outputs similar to this query" |

Working memory is fast and exact -- you know the key, you get the value. Message history is ordered and complete -- you get the last N messages from a conversation. Semantic recall is fuzzy and ranked -- you describe what you want, and it finds the closest matches.

## Namespaces

Every piece of memory lives in a namespace. A namespace scopes memory so that different workflows, agents, or users do not collide.

```ts
import type { MemoryNamespace } from "smithers-orchestrator/memory";

const ns: MemoryNamespace = { kind: "workflow", id: "code-review" };
// Serializes to: "workflow:code-review"
```

Four namespace kinds exist:

- `workflow` -- scoped to a specific workflow definition
- `agent` -- scoped to a specific agent identity
- `user` -- scoped to an end user
- `global` -- shared across everything

The `namespaceToString()` helper produces the canonical string form (`"workflow:code-review"`). All database queries filter by this string, so memory in one namespace never leaks into another.
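
A minimal sketch of that isolation, assuming the canonical form is simply `kind:id` as in the example above. `toNamespaceKey`, `rowsIn`, and the `rows` array are hypothetical stand-ins for the real helper and table, not part of the library.

```ts
type NamespaceKind = "workflow" | "agent" | "user" | "global";

interface Namespace {
  kind: NamespaceKind;
  id: string;
}

// Assumed serialization: "<kind>:<id>", matching "workflow:code-review".
function toNamespaceKey(ns: Namespace): string {
  return `${ns.kind}:${ns.id}`;
}

// Stand-in rows for what a namespaced table might hold.
const rows = [
  { namespace: "workflow:code-review", key: "last-reviewer" },
  { namespace: "agent:reviewer", key: "last-reviewer" },
];

// Every lookup compares against the serialized namespace string,
// so the same key in another namespace is never returned.
function rowsIn(ns: Namespace) {
  const key = toNamespaceKey(ns);
  return rows.filter((row) => row.namespace === key);
}
```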

## Working Memory

Working memory stores structured facts as JSON. Each fact has a namespace, a string key, and a JSON value. Optionally, a TTL in milliseconds causes the fact to expire automatically.

```ts
import { createMemoryStore } from "smithers-orchestrator/memory";

const store = createMemoryStore(db);

// Write a fact
await store.setFact(ns, "last-reviewer", { name: "Alice", score: 0.95 });

// Read it back
const fact = await store.getFact(ns, "last-reviewer");
// { namespace: "workflow:code-review", key: "last-reviewer", valueJson: '{"name":"Alice","score":0.95}', ... }

// List all facts in a namespace
const facts = await store.listFacts(ns);

// Delete
await store.deleteFact(ns, "last-reviewer");
```

Facts are upserted -- writing the same key twice replaces the previous value and updates the timestamp.
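
The upsert and TTL semantics can be sketched with an in-memory map. This is an illustration only, not the SQLite-backed store: `InMemoryFacts`, the `updatedAtMs` field name, and the injected clock are assumptions made so the behavior is easy to follow.

```ts
interface Fact {
  key: string;
  valueJson: string;
  updatedAtMs: number;
  ttlMs: number | undefined;
}

class InMemoryFacts {
  private readonly facts = new Map<string, Fact>();
  private readonly now: () => number;

  constructor(now: () => number) {
    this.now = now;
  }

  // Writing an existing key replaces the value and refreshes the timestamp.
  set(key: string, value: unknown, ttlMs?: number): void {
    this.facts.set(key, {
      key,
      valueJson: JSON.stringify(value),
      updatedAtMs: this.now(),
      ttlMs,
    });
  }

  // A fact with a TTL is treated as gone once the TTL has elapsed.
  get(key: string): Fact | undefined {
    const fact = this.facts.get(key);
    if (!fact) return undefined;
    const expired =
      fact.ttlMs !== undefined && this.now() >= fact.updatedAtMs + fact.ttlMs;
    return expired ? undefined : fact;
  }
}

let clock = 1_000;
const facts = new InMemoryFacts(() => clock);
facts.set("last-reviewer", { name: "Alice" });
clock = 2_000;
facts.set("last-reviewer", { name: "Bob" }, 500); // upsert with a 500 ms TTL
```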

## Message History

Message history records ordered messages in threads. A thread belongs to a namespace and can hold messages from any role (user, assistant, system, tool).

```ts
// Create a thread
const thread = await store.createThread(ns, "Review session #42");

// Save messages
await store.saveMessage({
  threadId: thread.threadId,
  role: "user",
  contentJson: JSON.stringify({ text: "Review this PR" }),
  runId: "run-123",
  nodeId: "review-task",
});

// List the last 20 messages
const messages = await store.listMessages(thread.threadId, 20);

// Count messages in a thread
const total = await store.countMessages(thread.threadId);

// Look up a thread by ID
const existing = await store.getThread(thread.threadId);

// Delete a thread and all its messages
await store.deleteThread(thread.threadId);
```

Threads are useful for building multi-turn conversations that persist across runs. A Ralph loop can write to the same thread each iteration, building up context over time.

## Semantic Recall

Semantic recall uses the existing RAG infrastructure (vector store + embedding model) to store and retrieve memory by meaning rather than by key.

```ts
import { createSemanticMemory } from "smithers-orchestrator/memory";
import { openai } from "@ai-sdk/openai";

const semantic = createSemanticMemory(
  vectorStore,
  openai.embedding("text-embedding-3-small"),
);

// Store a memory
await semantic.remember(ns, "The user prefers TypeScript over JavaScript");

// Recall by query
const results = await semantic.recall(ns, "What language does the user prefer?", {
  topK: 5,
  similarityThreshold: 0.7,
});
```

Under the hood, `remember` chunks the content, embeds it with the AI SDK's `embedMany()`, and upserts the vectors into the same `_smithers_vectors` table that RAG uses. `recall` embeds the query with `embed()`, searches the vector store filtered by namespace, and returns ranked results.

## Task Integration

The `memory` prop on `<Task>` connects memory to the execution flow. Before the agent runs, recalled context is prepended to the prompt. After the agent finishes, output is stored.

```tsx
<Task
  id="analyze"
  agent={reviewer}
  output={outputs.analysis}
  memory={{
    recall: { namespace: { kind: "workflow", id: "code-review" }, topK: 3 },
    remember: { namespace: { kind: "workflow", id: "code-review" }, key: "last-analysis" },
  }}
>
  Review this pull request
</Task>
```

- `memory.recall` -- before the agent call, query semantic memory and prepend the top results as context. Accepts an optional `query` to override the default prompt-based query.
- `memory.remember` -- after the agent call, store the output in both working memory (under the given key) and semantic memory
- `memory.threadId` -- optionally attach a message thread to the task, so conversation history persists across runs
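
To make the recall step concrete, here is a sketch of prepending recalled context to a prompt. The exact text Smithers injects is not documented here, so the header line, the numbering, and the `prependRecalledContext` helper are all hypothetical.

```ts
interface RecalledItem {
  content: string;
  score: number;
}

// Take the topK highest-scoring items and place them ahead of the prompt.
function prependRecalledContext(
  prompt: string,
  recalled: RecalledItem[],
  topK: number,
): string {
  const top = [...recalled].sort((a, b) => b.score - a.score).slice(0, topK);
  if (top.length === 0) return prompt; // nothing recalled: prompt is unchanged

  const context = top
    .map((item, index) => `${index + 1}. ${item.content}`)
    .join("\n");

  // Hypothetical framing text; the real format may differ.
  return `Relevant context from previous runs:\n${context}\n\n${prompt}`;
}

const promptWithContext = prependRecalledContext(
  "Review this pull request",
  [
    { content: "Prefers TypeScript", score: 0.9 },
    { content: "Uses pnpm", score: 0.4 },
    { content: "Targets Node 20", score: 0.8 },
  ],
  2,
);
```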

## Loop Memory

Ralph (Loop) can use memory to carry context across iterations. When `recallPreviousRuns` is enabled, each iteration recalls what happened in previous iterations via semantic search.

```tsx
<Loop until={done}>
  <Task
    id="iterate"
    agent={agent}
    output={outputs.result}
    memory={{
      recall: { namespace: { kind: "workflow", id: "my-loop" }, topK: 5 },
      remember: { namespace: { kind: "workflow", id: "my-loop" } },
    }}
  >
    Improve the previous result
  </Task>
</Loop>
```

Each iteration writes its output to memory and reads back the most relevant past outputs on the next iteration.

## Processors

Memory processors run maintenance operations on stored data. Each processor implements the `MemoryProcessor` interface with both Promise and Effect variants.

- **TtlGarbageCollector** -- deletes expired facts based on their `ttlMs` field
- **TokenLimiter** -- compresses message history when it exceeds a token budget
- **Summarizer** -- uses an LLM to summarize old messages, replacing many messages with a single summary

```ts
import { TtlGarbageCollector, TokenLimiter, Summarizer } from "smithers-orchestrator/memory";

// Delete all expired facts
const gc = TtlGarbageCollector();
await gc.process(store);

// Limit message history to ~4000 tokens
const limiter = TokenLimiter(4000);
await limiter.process(store);

// Summarize old messages using an LLM agent
const summarizer = Summarizer(myAgent);
await summarizer.process(store);
```

Processors run on-demand or can be wired into the workflow lifecycle.
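
One simple way to enforce a token budget is to keep only the newest messages that fit. The sketch below shows that idea and is not necessarily how `TokenLimiter` compresses history: `limitToTokenBudget` is a hypothetical helper, and the 4-characters-per-token estimate is an assumption.

```ts
interface StoredMessage {
  role: string;
  text: string;
}

// Rough heuristic: about four characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function limitToTokenBudget(
  messages: StoredMessage[],
  maxTokens: number,
): StoredMessage[] {
  const kept: StoredMessage[] = [];
  let used = 0;

  // Walk from newest to oldest so the most recent messages survive.
  for (let i = messages.length - 1; i >= 0; i--) {
    const message = messages[i]!;
    const cost = estimateTokens(message.text);
    if (used + cost > maxTokens) break;
    kept.unshift(message); // preserve chronological order
    used += cost;
  }
  return kept;
}

const history: StoredMessage[] = [
  { role: "user", text: "a".repeat(40) }, // ~10 tokens
  { role: "assistant", text: "b".repeat(20) }, // ~5 tokens
  { role: "user", text: "c".repeat(8) }, // ~2 tokens
];
const trimmed = limitToTokenBudget(history, 7);
```

Dropping old messages loses information; a summarizing processor trades that loss for an extra LLM call.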

## Effect Service

The `MemoryService` Effect tag bundles working memory, semantic recall, and message history into a single service layer. This is the recommended way to use memory in Effect-based code.

```ts
import { MemoryService, createMemoryLayer } from "smithers-orchestrator/memory";
import { Effect } from "effect";

const layer = createMemoryLayer({ db, vectorStore, embeddingModel });

const program = Effect.gen(function* () {
  const memory = yield* MemoryService;
  yield* memory.setFact(ns, "key", { value: 42 });
  const results = yield* memory.recall(ns, "search query");
  return results;
});
```

## Storage

Memory uses three SQLite tables, all created automatically:

| Table | Purpose |
|-------|---------|
| `_smithers_memory_facts` | Working memory key-value pairs |
| `_smithers_memory_threads` | Message thread metadata |
| `_smithers_memory_messages` | Individual messages within threads |

Semantic recall reuses the existing `_smithers_vectors` table from the RAG module. No separate vector table is needed.

## Observability

Memory operations emit structured events and update Effect metrics:

- `MemoryFactSet` / `MemoryRecalled` / `MemoryMessageSaved` events
- `smithers.memory.fact_reads` / `smithers.memory.fact_writes` counters
- `smithers.memory.message_saves` counter
- `smithers.memory.recall_queries` counter
- `smithers.memory.recall_duration_ms` histogram

---

## RAG (Retrieval-Augmented Generation)

> How Smithers chunks documents, embeds them, stores vectors, and retrieves context at query time.
> Source: https://smithers.sh/concepts/rag

Your agent needs to answer questions about a codebase, a set of design docs, or a knowledge base. The model's training data does not cover your private documents. You could paste everything into the prompt, but that blows up the context window and costs a fortune. RAG solves this by fetching only the relevant pieces at query time.

## The Pipeline

RAG in Smithers is a four-step pipeline:

1. **Chunk** -- split documents into small, overlapping pieces
2. **Embed** -- convert each chunk into a vector using an embedding model
3. **Store** -- persist vectors in a SQLite table alongside the original text
4. **Retrieve** -- embed the query, find the closest vectors, return the matching chunks

```
Document ──▶ Chunker ──▶ Embedder ──▶ Vector Store
                                           │
Query ──▶ Embedder ──▶ Similarity Search ──┘──▶ Ranked Results
```

Each step is a plain function. You can use them individually or wire them together with `createRagPipeline`.

## Chunking Strategies

A document rarely fits in a single embedding. Chunking breaks it into pieces that are small enough to embed and specific enough to be useful when retrieved.

Smithers ships five strategies:

| Strategy | Splits on | Best for |
|-----------|-----------|----------|
| `recursive` | Paragraphs, then lines, then words, then characters | General text (default) |
| `character` | Fixed character count | Uniform chunk sizes |
| `sentence` | Sentence boundaries | Prose, articles |
| `markdown` | Headings and sections | Documentation, READMEs |
| `token` | Approximate token count (~4 chars/token) | Token-budget-aware splitting |

Every strategy accepts `size` (max characters per chunk) and `overlap` (characters shared between adjacent chunks). Overlap prevents information loss at chunk boundaries. The `character` strategy also accepts `separator` to override the default `"\n\n"` split boundary.

```ts
import { chunk, createDocument } from "smithers-orchestrator/rag";

const doc = createDocument("Your long document text here...");
const chunks = chunk(doc, { strategy: "recursive", size: 1000, overlap: 200 });

// Character strategy with a custom separator
const csvChunks = chunk(doc, { strategy: "character", size: 500, overlap: 50, separator: "\n" });
```
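
To show what `size` and `overlap` do, here is a self-contained sketch of fixed-size chunking on plain strings. `chunkByCharacters` is a hypothetical helper, not the library's `chunk` function, and it ignores separators and document metadata.

```ts
function chunkByCharacters(
  text: string,
  size: number,
  overlap: number,
): string[] {
  if (size <= 0) throw new Error("size must be positive");
  if (overlap < 0 || overlap >= size) {
    throw new Error("overlap must be at least 0 and less than size");
  }

  const chunks: string[] = [];
  const step = size - overlap; // how far the window advances each time

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    // Stop once a chunk reaches the end of the text.
    if (start + size >= text.length) break;
  }
  return chunks;
}

const pieces = chunkByCharacters("abcdefghij", 4, 2);
// → ["abcd", "cdef", "efgh", "ghij"]
```

Each chunk repeats the last two characters of the previous one, so a phrase that straddles a boundary still appears whole in at least one chunk, provided it is no longer than the overlap.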

## Embedding

Smithers wraps the Vercel AI SDK's `embed()` and `embedMany()`. You bring any embedding model the AI SDK supports -- OpenAI, Google, Mistral, Cohere.

```ts
import { embedChunks, embedQuery } from "smithers-orchestrator/rag";
import { openai } from "@ai-sdk/openai";

const model = openai.embedding("text-embedding-3-small");
const embedded = await embedChunks(chunks, model);
const queryVector = await embedQuery("How does caching work?", model);
```

The embedder is intentionally thin. It bridges Smithers chunk types to the AI SDK and adds structured logging. No custom vector math.

## Vector Store

Vectors live in SQLite. No external database required. The `_smithers_vectors` table stores each chunk's text, embedding (as a Float32 BLOB), dimensions, and metadata (as JSON). Document metadata set via `createDocument` is propagated to every chunk and persisted in `metadata_json`, so it survives round-trips through the store. Queries do a full-table scan with JavaScript cosine similarity using the AI SDK's `cosineSimilarity()`.

This is fast enough for typical RAG workloads (hundreds to low thousands of chunks). If you outgrow it, swap in a different store implementation.

```ts
import { createSqliteVectorStore } from "smithers-orchestrator/rag";

const store = createSqliteVectorStore(workflow.db);
await store.upsert(embedded);
const results = await store.query(queryVector, { topK: 5 });
```
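
The full-scan query described above amounts to scoring every row and keeping the best `topK`. The sketch below uses a hand-written cosine function in place of the AI SDK's `cosineSimilarity()` so that it runs standalone; `StoredVector` and `queryTopK` are hypothetical names.

```ts
interface StoredVector {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of the norms.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score all rows, sort by descending similarity, keep the first topK.
function queryTopK(rows: StoredVector[], query: number[], topK: number) {
  return rows
    .map((row) => ({
      id: row.id,
      text: row.text,
      score: cosine(row.embedding, query),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

const ranked = queryTopK(
  [
    { id: "a", text: "caching", embedding: [1, 0] },
    { id: "b", text: "billing", embedding: [0, 1] },
    { id: "c", text: "cache eviction", embedding: [1, 1] },
  ],
  [1, 0],
  2,
);
```

The cost grows linearly with the number of stored chunks, which is why this approach suits small to medium collections.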

## The RAG Pipeline

`createRagPipeline` wires all four steps together:

```ts
import { createRagPipeline } from "smithers-orchestrator/rag";

const pipeline = createRagPipeline({
  vectorStore: store,
  embeddingModel: model,
  chunkOptions: { strategy: "markdown", size: 1000, overlap: 200 },
  topK: 10, // default topK for all retrieve calls (default 10)
  namespace: "docs", // optional: scope to a namespace
});

// Ingest
await pipeline.ingest([doc1, doc2]);
await pipeline.ingestFile("./docs/architecture.md");

// Retrieve — per-call topK overrides the pipeline default
const results = await pipeline.retrieve("How does the scheduler work?", { topK: 5 });
```

## The RAG Tool

Agents can search the knowledge base themselves. `createRagTool` exposes the pipeline as a tool:

```ts
import { createRagTool } from "smithers-orchestrator/rag";

const searchKnowledge = createRagTool(pipeline, {
  name: "search_knowledge",
  description: "Search the project knowledge base",
  defaultTopK: 5, // default results returned when agent omits topK
});
```

Hand this tool to any agent. When the agent calls it, Smithers embeds the query, searches the vector store, and returns the top results with relevance scores and metadata. The agent's `topK` parameter accepts 1-50; when omitted, `defaultTopK` (default 5) is used.

## Namespaces

A single vector store can hold multiple document collections. Pass a `namespace` to keep them separate:

```ts
const pipeline = createRagPipeline({
  vectorStore: store,
  embeddingModel: model,
  chunkOptions: { strategy: "recursive" },
  namespace: "api-docs",
});
```

Namespaces share the same SQLite table, but each query searches only within its own namespace.

## Document Format Detection

When you call `createDocument(content)`, Smithers auto-detects the format so the chunker can split intelligently:

| Detection rule | Assigned format |
|---------------|-----------------|
| Content starts with `{` or `[` and is valid JSON | `json` |
| Content starts with `<!` or `<html` | `html` |
| Content has a line starting with one to six `#` characters | `markdown` |
| Everything else | `text` |

You can override auto-detection by passing `format` explicitly:

```ts
const doc = createDocument(content, { format: "markdown" });
```

`loadDocument(path)` uses the file extension (`.md`, `.mdx`, `.html`, `.htm`, `.json`) as a hint before inspecting the content, so the chunker uses heading-aware splitting for Markdown files even if the heading markers are uncommon.
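
The detection rules and the extension hint can be sketched as follows. This is an approximation rather than the library's code: `detectFormat` and `detectFormatForPath` are hypothetical names, and trimming leading whitespace, matching HTML case-insensitively, requiring whitespace after the `#` run, and mapping `.mdx` to `markdown` are all assumptions.

```ts
type DocumentFormat = "json" | "html" | "markdown" | "text";

function detectFormat(content: string): DocumentFormat {
  const trimmed = content.trimStart();

  // JSON: starts with { or [ and actually parses.
  if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
    try {
      JSON.parse(trimmed);
      return "json";
    } catch {
      // Not valid JSON; fall through to the remaining rules.
    }
  }

  const lower = trimmed.toLowerCase();
  if (lower.startsWith("<!") || lower.startsWith("<html")) return "html";

  // Markdown: any line that starts with one to six # characters.
  if (/^#{1,6}\s/m.test(content)) return "markdown";

  return "text";
}

const EXTENSION_HINTS: Record<string, DocumentFormat> = {
  ".md": "markdown",
  ".mdx": "markdown",
  ".html": "html",
  ".htm": "html",
  ".json": "json",
};

// Prefer the extension hint; inspect the content only when there is none.
function detectFormatForPath(path: string, content: string): DocumentFormat {
  const dot = path.lastIndexOf(".");
  const extension = dot >= 0 ? path.slice(dot).toLowerCase() : "";
  return EXTENSION_HINTS[extension] ?? detectFormat(content);
}
```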

## Deleting Vectors

Remove specific chunks from the vector store by ID:

```ts
await store.delete(["chunk-id-1", "chunk-id-2"]);
```

Passing an empty array is a no-op. Use this to keep the store current when source documents are updated or removed.

## Counting Vectors

Check how many chunks are stored in a namespace:

```ts
const total = await store.count();            // default namespace
const apiDocs = await store.count("api-docs"); // specific namespace
```

Useful for verifying that ingestion completed and for monitoring store growth over time.

## Query Filters

`VectorQueryOptions` accepts an optional `filter` map that is passed through to the store implementation. The SQLite store does not apply metadata filters during the SQL query (it scores all rows and sorts), but custom store implementations can use `filter` to pre-select rows:

```ts
const results = await store.query(queryVector, {
  topK: 5,
  namespace: "api-docs",
  filter: { source: "architecture.md" },
});
```

With the default SQLite store, set the relevant metadata fields on the document (they are persisted with every chunk) and filter the returned results in application code.
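
Filtering in application code can be as small as the sketch below. `filterByMetadata` is a hypothetical helper, and the result shape (an object with a `metadata` record) is an assumption about what the store returns.

```ts
interface RetrievedChunk {
  id: string;
  score: number;
  metadata: Record<string, unknown>;
}

// Keep only results whose metadata matches every key in the filter.
function filterByMetadata<T extends { metadata: Record<string, unknown> }>(
  results: T[],
  filter: Record<string, unknown>,
): T[] {
  return results.filter((result) =>
    Object.entries(filter).every(
      ([key, value]) => result.metadata[key] === value,
    ),
  );
}

const retrieved: RetrievedChunk[] = [
  { id: "1", score: 0.92, metadata: { source: "architecture.md" } },
  { id: "2", score: 0.88, metadata: { source: "changelog.md" } },
  { id: "3", score: 0.71, metadata: { source: "architecture.md" } },
];

const fromArchitecture = filterByMetadata(retrieved, {
  source: "architecture.md",
});
```

Because filtering happens after ranking, it can help to request a larger `topK` than you need and trim the list once the filter has been applied.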

## Effect Service Layer

For Effect-native workflows, a `RagService` Effect layer wraps the pipeline:

```ts
import { RagService, createRagServiceLayer, retrieve, ingest } from "smithers-orchestrator/rag";
import { Effect } from "effect";

const layer = createRagServiceLayer({
  vectorStore: store,
  embeddingModel: model,
  chunkOptions: { strategy: "markdown" },
});

const program = Effect.gen(function* () {
  yield* ingest([doc]);
  const results = yield* retrieve("How does auth work?", 5);
  return results;
}).pipe(Effect.provide(layer));
```

`ingest` and `retrieve` are convenience functions that pull `RagService` from Effect context automatically.

Lower-level Effect wrappers are also exported for direct use outside the service layer: `embedChunksEffect`, `embedQueryEffect`, `ingestEffect`, `retrieveEffect`, `upsertEffect`, and `queryEffect`. These give you Effect-typed versions of each pipeline step without requiring the full `RagService` context.

## Observability Metrics

RAG operations export four metrics:

| Metric | Type | Description |
|--------|------|-------------|
| `smithers.rag.ingest_total` | counter | Total documents ingested (incremented per `ingest` call by document count) |
| `smithers.rag.retrieve_total` | counter | Total retrieval queries executed |
| `smithers.rag.retrieve_duration_ms` | histogram | End-to-end retrieval latency (embed + query) |
| `smithers.rag.embed_duration_ms` | histogram | Time to embed a batch of chunks |

These integrate with the standard Smithers observability pipeline and appear in Prometheus exports and OpenTelemetry traces.

## CLI

Ingest files and query from the command line:

```bash
# Ingest a file
smithers rag ingest ./docs/api.md --workflow my-workflow.tsx

# Query the knowledge base
smithers rag query "How does authentication work?" --workflow my-workflow.tsx --top-k 5
```

---

## Voice

> How voice providers bring speech-to-text and text-to-speech into Smithers workflows.
> Source: https://smithers.sh/concepts/voice

Your workflow orchestrates code reviews, generates reports, analyzes data -- all in text. But some tasks start with an audio recording or need to produce spoken output. Maybe you have a meeting transcript to analyze, or you want your pipeline to read results aloud. That is what voice providers are for.

## What Is a Voice Provider?

A voice provider wraps a speech service behind a simple interface. It can do one or more of three things:

1. **Speak** -- convert text to audio (text-to-speech / TTS)
2. **Listen** -- convert audio to text (speech-to-text / STT)
3. **Realtime** -- bidirectional audio streaming over a WebSocket (speech-to-speech)

You pick the provider, configure it once, and hand it to your tasks. Smithers handles the wiring.

```ts
import { createAiSdkVoice } from "smithers-orchestrator/voice";
import { openai } from "@ai-sdk/openai";

const voice = createAiSdkVoice({
  speechModel: openai.speech("tts-1"),
  transcriptionModel: openai.transcription("whisper-1"),
});
```

That single object now speaks and listens. The AI SDK handles the actual API calls; Smithers gives you the integration layer.

## The `<Voice>` Component

Wrap a subtree with `<Voice>` and every task inside inherits that voice provider:

```tsx
<Voice provider={voice} speaker="alloy">
  <Task id="transcribe" output={outputs.transcript} agent={myAgent}>
    Transcribe the uploaded audio file.
  </Task>
  <Task id="summarize" output={outputs.summary} agent={myAgent}>
    Summarize the transcript.
  </Task>
</Voice>
```

The `<Voice>` component does not execute anything itself. It annotates the tasks beneath it, the same way `<Worktree>` annotates tasks with a filesystem path or `<Parallel>` annotates them with concurrency limits.

Tasks inside a `<Voice>` scope receive `voice` and `voiceSpeaker` on their descriptors. The engine uses these to call `voice.listen()` when the task needs audio input or `voice.speak()` when it produces audio output.

## Batch vs Realtime

Two fundamentally different modes. Batch is what most people need.

**Batch**: send a blob of text, get a blob of audio back (or vice versa). One request, one response. The AI SDK's `experimental_generateSpeech` and `experimental_transcribe` handle this. It works with OpenAI, ElevenLabs, Deepgram, and others -- any provider the AI SDK supports.

**Realtime**: open a persistent WebSocket, stream audio in both directions simultaneously. OpenAI's Realtime API does this. Latency is low, but the protocol is more complex. Smithers provides `createOpenAIRealtimeVoice()` for this case because the AI SDK does not cover it.

Most workflows should start with batch. Reach for realtime only when you need live conversation.

## Composite Voice

What if you want Deepgram for transcription but ElevenLabs for speech? Composite voice mixes providers:

```ts
import { createCompositeVoice, createAiSdkVoice } from "smithers-orchestrator/voice";
import { deepgram } from "@ai-sdk/deepgram";
import { elevenlabs } from "@ai-sdk/elevenlabs";

const listener = createAiSdkVoice({
  transcriptionModel: deepgram.transcription("nova-3"),
});
const speaker = createAiSdkVoice({
  speechModel: elevenlabs.speech("eleven_multilingual_v2"),
});

const voice = createCompositeVoice({
  input: listener,
  output: speaker,
});
```

When a task calls `voice.listen()`, it routes to Deepgram. When it calls `voice.speak()`, it routes to ElevenLabs. If you also set a `realtime` provider, it takes priority for both operations.
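
The routing rule is small enough to sketch directly. `pickProvider` and the `SketchProvider` shape are hypothetical; the real composite delegates to provider objects rather than returning them.

```ts
interface SketchProvider {
  name: string;
}

interface CompositeConfig {
  input?: SketchProvider;
  output?: SketchProvider;
  realtime?: SketchProvider;
}

function pickProvider(
  config: CompositeConfig,
  operation: "listen" | "speak",
): SketchProvider | undefined {
  // A realtime provider, when present, wins for both operations.
  if (config.realtime) return config.realtime;
  // Otherwise listening goes to the input and speaking to the output.
  return operation === "listen" ? config.input : config.output;
}

const batchOnly: CompositeConfig = {
  input: { name: "deepgram" },
  output: { name: "elevenlabs" },
};
const withRealtime: CompositeConfig = {
  ...batchOnly,
  realtime: { name: "openai-realtime" },
};
```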

## Effect Service Layer

For power users who build with Effect.ts directly, voice exposes an Effect service:

```ts
import { VoiceService, speak, listen } from "smithers-orchestrator/voice";
import { Effect } from "effect";

const program = Effect.gen(function* () {
  const transcript = yield* listen(audioStream);
  const audio = yield* speak(`Summary: ${transcript}`);
  return audio;
}).pipe(Effect.provideService(VoiceService, myVoice));
```

The `VoiceService` tag lets you inject a voice provider into any Effect pipeline. The `speak()` and `listen()` functions pull it from context automatically.

For scoped lifecycle management (automatic `connect()` and `close()`), use `createVoiceServiceLayer()`:

```ts
import { createVoiceServiceLayer, speak } from "smithers-orchestrator/voice";
import { Effect, Layer } from "effect";

const voiceLayer = createVoiceServiceLayer(realtimeVoice);

const program = Effect.gen(function* () {
  const audio = yield* speak("Hello from Effect");
  return audio;
}).pipe(Effect.provide(voiceLayer));
```

The layer handles calling `connect()` when the scope opens and `close()` when it closes.

## Listing Available Speakers

Every voice provider exposes `getSpeakers()`, which returns the list of voices that provider supports:

```ts
const speakers = await voice.getSpeakers();
// [{ voiceId: "alloy" }, { voiceId: "echo" }, ...]
```

For the OpenAI Realtime provider, this returns the eight built-in voices: `alloy`, `ash`, `ballad`, `coral`, `echo`, `sage`, `shimmer`, and `verse`.

For composite voice, `getSpeakers()` delegates to the realtime provider if one is set, otherwise to the output (TTS) provider. If neither is configured, it returns an empty array.

## Updating Voice Config at Runtime

You can update voice session parameters after initialization without reconnecting. Call `updateConfig` with any session-level settings the provider understands:

```ts
voice.updateConfig({
  voice: "shimmer",
  turn_detection: { type: "server_vad" },
});
```

For the OpenAI Realtime provider, `updateConfig` sends a `session.update` event over the existing WebSocket. Changes take effect for subsequent interactions in the same session. For composite voice, `updateConfig` delegates to the realtime provider.

## Manually Triggering a Realtime Response

In realtime (speech-to-speech) mode, OpenAI's server can detect speech automatically. But you can also trigger a response explicitly with `answer()`:

```ts
await voice.answer({
  modalities: ["audio"],
  instructions: "Summarize what was just said",
});
```

`answer()` sends a `response.create` event to the WebSocket. Any options you pass are forwarded as response properties. Call it when you want the model to respond immediately without waiting for voice activity detection.

## Overriding the WebSocket URL

The default WebSocket endpoint for OpenAI Realtime is `wss://api.openai.com/v1/realtime`. Override it with the `url` config option:

```ts
const voice = createOpenAIRealtimeVoice({
  url: "wss://my-proxy.example.com/realtime",
  model: "gpt-4o-mini-realtime-preview-2024-12-17",
});
```

The model name is appended as a query parameter (`?model=...`), so the full connection URL becomes `wss://my-proxy.example.com/realtime?model=gpt-4o-mini-realtime-preview-2024-12-17`. Use this for proxies, local development stubs, or alternative endpoints.
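
If you need to predict the final connection URL, for example to configure a proxy allowlist, the construction can be sketched with the standard `URL` class. `buildRealtimeUrl` is a hypothetical helper, not a Smithers export, and using `searchParams.set` (which replaces any existing `model` parameter) is an assumption.

```ts
const DEFAULT_REALTIME_URL = "wss://api.openai.com/v1/realtime";

function buildRealtimeUrl(
  model: string,
  baseUrl: string = DEFAULT_REALTIME_URL,
): string {
  const url = new URL(baseUrl);
  // Add the model as a query parameter on the base endpoint.
  url.searchParams.set("model", model);
  return url.toString();
}

const proxied = buildRealtimeUrl(
  "gpt-4o-mini-realtime-preview-2024-12-17",
  "wss://my-proxy.example.com/realtime",
);
```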

## Configuring the Transcription Model

By default, the OpenAI Realtime provider transcribes incoming audio with `whisper-1`. Change the transcription model with the `transcriber` config option:

```ts
const voice = createOpenAIRealtimeVoice({
  transcriber: "gpt-4o-transcribe",
});
```

The transcriber is sent to the server in the `session.update` call issued immediately after connection. It sets the `input_audio_transcription` session property, which controls how the Realtime API transcribes user audio.

## Audio Format Support

When calling `speak()`, you can request a specific audio format via the `format` option:

```ts
const audio = await voice.speak("Hello, world", { format: "opus" });
```

Supported formats:

| Format | Description |
| --- | --- |
| `mp3` | MPEG Layer 3 — widely compatible, lossy |
| `wav` | Waveform Audio — uncompressed, lossless |
| `pcm` | Raw PCM — no header, lowest overhead |
| `opus` | Opus codec — low latency, good for streaming |
| `flac` | Free Lossless Audio Codec |
| `aac` | Advanced Audio Coding — good compression |

Not every provider supports every format. If a provider does not support the requested format, it falls back to its default format. The `AudioFormat` type is exported from `smithers-orchestrator/voice` for type-safe usage.

## Provider-Level Event Callbacks

Realtime voice providers emit events that you can subscribe to with `on()` and unsubscribe from with `off()`:

```ts
const handler = (data: unknown) => console.log(data);

voice.on("speaking", handler);   // audio output chunks
voice.on("writing", handler);    // text transcription chunks
voice.on("error", handler);      // provider errors
voice.on("speaker", handler);    // new audio output stream

voice.off("speaking", handler);  // remove a listener
```

| Event | Payload | When |
| --- | --- | --- |
| `speaking` | `{ audio, response_id }` | Each chunk of audio output from the model |
| `writing` | `{ text, role, response_id }` | Each chunk of text transcription |
| `error` | `{ message, code?, details? }` | A provider-level error occurred |
| `speaker` | `ReadableStream` | A new audio response stream was created |

These are provider-level events on the voice instance. They are separate from the Smithers event bus events (`VoiceStarted`, `VoiceFinished`, `VoiceError`) which track operation lifecycle at the workflow level.

## Default Speaker Selection

If you don't specify a `speaker` prop on `<Voice>` or a `speaker` option in the provider config, the default depends on the provider:

- **OpenAI Realtime**: defaults to `"alloy"`
- **AI SDK Voice**: no Smithers-level default; pass a speaker via `SpeakOptions` or the provider config, otherwise the underlying model's default voice is used
- **Composite Voice**: delegates to whichever sub-provider handles the operation

You can override the speaker at three levels (highest priority first):

1. Per-call: `voice.speak("text", { speaker: "shimmer" })`
2. Per-component: `<Voice provider={voice} speaker="coral">`
3. Per-provider: `createOpenAIRealtimeVoice({ speaker: "echo" })`
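
The precedence order above is a simple fallback chain. The sketch below expresses it with nullish coalescing; `resolveSpeaker` and `SpeakerSources` are hypothetical names used only for illustration.

```ts
interface SpeakerSources {
  perCall?: string;
  perComponent?: string;
  perProvider?: string;
}

// The first defined value wins, checked from highest to lowest priority.
function resolveSpeaker(
  sources: SpeakerSources,
  providerDefault?: string,
): string | undefined {
  return (
    sources.perCall ??
    sources.perComponent ??
    sources.perProvider ??
    providerDefault
  );
}

const chosen = resolveSpeaker(
  { perComponent: "coral", perProvider: "echo" },
  "alloy",
);
// → "coral": no per-call speaker was given, so the component-level value wins.
```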

## OpenAI Realtime: API Key and Environment Fallback

The OpenAI Realtime provider resolves API keys in this order:

1. The `apiKey` config option passed to `createOpenAIRealtimeVoice()`
2. The `OPENAI_API_KEY` environment variable

```ts
// Explicit key
const voice = createOpenAIRealtimeVoice({ apiKey: "sk-..." });

// Or rely on the environment variable; no config needed
const voiceFromEnv = createOpenAIRealtimeVoice();
// Uses process.env.OPENAI_API_KEY automatically
```

If neither is set, `connect()` throws an error.
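
A sketch of that resolution order, with the environment passed in explicitly so it can be exercised without touching `process.env`. `resolveApiKey` and its error message are hypothetical.

```ts
function resolveApiKey(
  configKey: string | undefined,
  env: Record<string, string | undefined>,
): string {
  // The explicit config option takes precedence over the environment.
  const key = configKey ?? env["OPENAI_API_KEY"];
  if (!key) {
    throw new Error("No OpenAI API key: set `apiKey` or OPENAI_API_KEY");
  }
  return key;
}

const fromConfig = resolveApiKey("sk-config", { OPENAI_API_KEY: "sk-env" });
const fromEnv = resolveApiKey(undefined, { OPENAI_API_KEY: "sk-env" });
```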

## OpenAI Realtime: Model Override

Override the realtime model with the `model` config option:

```ts
const voice = createOpenAIRealtimeVoice({
  model: "gpt-4o-realtime-preview",
});
```

The default is `gpt-4o-mini-realtime-preview-2024-12-17`. The model name is appended as a query parameter to the WebSocket URL.

## OpenAI Realtime: Session Management

The OpenAI Realtime provider manages WebSocket session lifecycle automatically:

1. **`connect()`** opens a WebSocket, waits for the `session.created` event, then sends an initial `session.update` to configure the transcription model and default voice.
2. While connected, any calls to `send()`, `speak()`, `listen()`, or `answer()` use the active session.
3. **`close()`** tears down the connection, cleans up speaker streams, and releases resources.

Messages sent before the session is ready are automatically queued and flushed once the connection opens. You don't need to wait for `session.created` yourself — `connect()` returns only after the session is fully initialized.

```ts
const voice = createOpenAIRealtimeVoice({ speaker: "coral" });

await voice.connect();    // waits for session.created + session.update
await voice.send(audio);  // uses the active session
voice.close();            // tears down cleanly
```

If you call `connect()` while already connected, it returns immediately. Concurrent calls to `connect()` are deduplicated — only one connection attempt runs at a time.
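
Deduplicating concurrent connection attempts is commonly done by sharing one in-flight promise. The sketch below shows that pattern in isolation; `SketchConnection` is hypothetical and does not open a real WebSocket.

```ts
class SketchConnection {
  attempts = 0;
  private connected = false;
  private pending: Promise<void> | undefined;
  private readonly open: () => Promise<void>;

  constructor(open: () => Promise<void>) {
    this.open = open;
  }

  connect(): Promise<void> {
    // Already connected: nothing to do.
    if (this.connected) return Promise.resolve();
    // An attempt is in flight: hand back the same promise.
    if (this.pending) return this.pending;

    this.attempts += 1;
    this.pending = this.open().then(
      () => {
        this.connected = true;
        this.pending = undefined;
      },
      (error: unknown) => {
        // Clear the slot so a later call can retry, then rethrow.
        this.pending = undefined;
        throw error;
      },
    );
    return this.pending;
  }
}

const connection = new SketchConnection(() => Promise.resolve());
const first = connection.connect();
const second = connection.connect(); // reuses the in-flight attempt
```

Note that `connect()` is deliberately not declared `async` here: an `async` method would wrap the stored promise in a new one on every call, which still works but hides the fact that callers share a single attempt.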

## Events and Observability

Voice operations emit structured events:

- `VoiceStarted` -- a voice operation began (speak or listen)
- `VoiceFinished` -- it completed successfully
- `VoiceError` -- something went wrong

These flow through the same event bus as all other Smithers events. The `smithers.voice.operations_total` counter and `smithers.voice.duration_ms` histogram track volume and latency.

---

## OpenAPI Tools

> Turn any OpenAPI spec into tools your agents can call.
> Source: https://smithers.sh/concepts/openapi-tools

You have an internal API. It has an OpenAPI spec. Your agent needs to call it. You could hand-write a tool for every endpoint -- define the schema, build the URL, set the headers, parse the response. Or you could point Smithers at the spec and let it do that for you.

## The Problem

Every REST API endpoint you want an agent to use requires a tool. A tool needs three things: a Zod schema describing the parameters, a description the LLM can read, and an execute function that makes the HTTP request. For a single endpoint that is fine. For an API with forty endpoints, it is tedious and error-prone.

OpenAPI specs already contain all the information you need. The parameter types are there. The descriptions are there. The URL patterns and HTTP methods are there. The only question is how to convert that information into tools.

## The Solution

`createOpenApiTools` reads an OpenAPI 3.0+ spec and returns a `Record<string, Tool>` -- one tool per operation. Each tool has a Zod schema derived from the operation's parameters and request body, a description from the operation's summary, and an execute function that builds the correct HTTP request and returns the response.

```ts
import { createOpenApiTools } from "smithers-orchestrator";

const tools = await createOpenApiTools("https://api.example.com/openapi.json", {
  auth: { type: "bearer", token: process.env.API_TOKEN! },
});
```

That is the whole thing. `tools` is now a map of operation IDs to AI SDK tools. Hand them to an agent:

```tsx
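import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { Task } from "smithers-orchestrator";

const apiAgent = new Agent({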
  model: anthropic("claude-sonnet-4-20250514"),
  tools,
});

<Task id="fetch-data" agent={apiAgent}>
  List the first 10 items from the inventory API.
</Task>
```

The agent sees tool descriptions like "List all pets" and parameters like `{ limit: z.number().optional() }`. It decides which endpoints to call, fills in the parameters, and gets back JSON responses. No glue code required.

## How It Works

The conversion follows four steps:

1. **Parse the spec.** Smithers loads the OpenAPI document (JSON object, URL, or file path), resolves `$ref` pointers, and extracts every operation.

2. **Convert schemas.** Each operation's path parameters, query parameters, header parameters, and request body are converted from JSON Schema into Zod schemas. Strings become `z.string()`, integers become `z.number().int()`, objects become `z.object()` with the correct shape. When a schema is too complex for clean conversion, Smithers falls back to `z.any()` with a description so the LLM still knows what to provide.

3. **Build the tool.** Each operation becomes an AI SDK `tool()` with the converted schema as `inputSchema`, the operation summary as `description`, and an execute function that assembles the HTTP request.

4. **Execute at runtime.** When an agent calls the tool, the execute function substitutes path parameters into the URL, appends query parameters, sets headers (including authentication), sends the request via `fetch`, and returns the response body.
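Step 4 can be sketched as a plain function. This is a minimal illustration, not Smithers's internal code: `buildRequestUrl` and its parameters are hypothetical, and the choice to drop `undefined` query values is an assumption.

```ts
// Hypothetical helper illustrating step 4; not Smithers's internal code.
type Primitive = string | number | boolean;

function buildRequestUrl(
  baseUrl: string,
  pathTemplate: string,
  pathParams: Record<string, Primitive>,
  query: Record<string, Primitive | undefined>,
): string {
  // Substitute each {name} placeholder with its URL-encoded value.
  const path = pathTemplate.replace(/\{([^}]+)\}/g, (_match: string, name: string) => {
    const value = pathParams[name];
    if (value === undefined) {
      throw new Error(`Missing path parameter: ${name}`);
    }
    return encodeURIComponent(String(value));
  });

  // Append only the query parameters that were actually provided (assumption).
  const pairs: string[] = [];
  for (const key of Object.keys(query)) {
    const value = query[key];
    if (value !== undefined) {
      pairs.push(`${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`);
    }
  }

  const trimmedBase = baseUrl.replace(/\/+$/, "");
  return pairs.length > 0 ? `${trimmedBase}${path}?${pairs.join("&")}` : `${trimmedBase}${path}`;
}

const exampleUrl = buildRequestUrl(
  "https://api.petstore.example.com/",
  "/pets/{petId}/toys",
  { petId: "a b" },
  { limit: 10, cursor: undefined },
);
// exampleUrl === "https://api.petstore.example.com/pets/a%20b/toys?limit=10"
```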

## Authentication

Three authentication methods are supported:

```ts
// Bearer token
{ auth: { type: "bearer", token: "sk-..." } }

// Basic auth
{ auth: { type: "basic", username: "admin", password: "secret" } }

// API key (in header or query)
{ auth: { type: "apiKey", name: "X-API-Key", value: "key123", in: "header" } }
```

You can also pass arbitrary headers:

```ts
{ headers: { "X-Custom-Header": "value" } }
```

## Filtering Operations

Most APIs have endpoints you do not want an agent calling. Use `include` or `exclude` to control which operations become tools:

```ts
// Only these operations
const readOnlyTools = await createOpenApiTools(spec, {
  include: ["listPets", "getPet"],
});

// Everything except these
const nonDestructiveTools = await createOpenApiTools(spec, {
  exclude: ["deletePet", "deleteAllPets"],
});
```
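As a rough mental model, filtering is a set operation over operation IDs. The sketch below is illustrative only: `selectOperationIds` is a hypothetical name, and applying `include` first and `exclude` second when both are given is an assumption, not documented behavior.

```ts
// Hypothetical sketch of include/exclude filtering over operation IDs.
type OperationFilter = { include?: string[]; exclude?: string[] };

function selectOperationIds(allIds: string[], filter: OperationFilter): string[] {
  const include = filter.include ? new Set(filter.include) : undefined;
  const exclude = new Set(filter.exclude ?? []);
  return allIds.filter((id) => {
    const included = include ? include.has(id) : true; // no include list means "all"
    return included && !exclude.has(id);
  });
}

const allOperationIds = ["listPets", "getPet", "deletePet", "deleteAllPets"];
const onlyReads = selectOperationIds(allOperationIds, { include: ["listPets", "getPet"] });
const withoutDeletes = selectOperationIds(allOperationIds, { exclude: ["deletePet", "deleteAllPets"] });
// Both calls yield ["listPets", "getPet"] for this list.
```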

## Single Operation

If you only need one tool from a spec, use `createOpenApiTool`:

```ts
import { createOpenApiTool } from "smithers-orchestrator";

const listPets = await createOpenApiTool(spec, "listPets", {
  baseUrl: "https://api.petstore.example.com",
});
```

## Observability

Every OpenAPI tool call emits an `OpenApiToolCalled` event and updates three metrics:

- `smithers.openapi.tool_calls` -- counter of total calls
- `smithers.openapi.tool_call_errors` -- counter of failed calls
- `smithers.openapi.tool_duration_ms` -- histogram of call durations

These integrate with the standard Smithers observability pipeline, so they appear in your logs, Prometheus exports, and OpenTelemetry traces alongside all other tool metrics.

## Synchronous Loading

Two variants exist. The async `createOpenApiTools` and `createOpenApiTool` work with any input -- objects, local files, or remote URLs (fetched via `fetch`). The sync variants `createOpenApiToolsSync` and `createOpenApiToolSync` skip the network fetch step, so they only work with spec objects or local file paths:

```ts
import { createOpenApiToolsSync } from "smithers-orchestrator";

// Works: spec object already in memory
const toolsFromObject = createOpenApiToolsSync(specObject, options);

// Works: local file read synchronously
const toolsFromFile = createOpenApiToolsSync("/path/to/openapi.json", options);

// Does not work: sync cannot fetch URLs
// const tools = createOpenApiToolsSync("https://api.example.com/openapi.json");
```

Use the sync variant when you are initializing tools at module load time and cannot await.

## Operation ID Fallback

If an OpenAPI operation does not have an `operationId`, Smithers generates one from the HTTP method and path. For example, `GET /pets/{petId}` becomes `get_pets_petId`. The generated ID strips braces and non-alphanumeric characters, joining segments with underscores.

You should still set explicit `operationId` values in your spec whenever possible -- they make tool names readable and stable. The fallback exists so that specs without IDs still produce usable tools.
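The fallback rule can be sketched in a few lines. This is an illustration consistent with the `GET /pets/{petId}` example above, not the actual Smithers code; `fallbackOperationId` is a hypothetical name, and the handling of edge cases such as dashes or an empty path is an assumption.

```ts
// Hypothetical sketch of the operationId fallback described above.
function fallbackOperationId(method: string, path: string): string {
  const segments = path
    .split("/")
    .map((segment) => segment.replace(/[^A-Za-z0-9]/g, "")) // drops braces and other punctuation
    .filter((segment) => segment.length > 0);
  return [method.toLowerCase(), ...segments].join("_");
}

const generatedId = fallbackOperationId("GET", "/pets/{petId}");
// generatedId === "get_pets_petId"
```

Because the generated name depends on the path, renaming a route changes the tool name, which is one more reason to prefer explicit `operationId` values.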

## Loading a Spec via the Effect Layer

For Effect-native code, `loadSpecEffect` returns an `Effect.Effect<OpenApiSpec>` so you can compose spec loading with your existing Effect pipeline:

```ts
import { loadSpecEffect } from "smithers-orchestrator/openapi";
import { Effect } from "effect";

const program = Effect.gen(function* () {
  const spec = yield* loadSpecEffect("https://api.example.com/openapi.json");
  // spec is a fully parsed OpenApiSpec object
});
```

`loadSpecEffect` resolves URLs via `fetch`, reads local files synchronously, and parses both JSON and YAML. Pass a spec object and it returns immediately.

## Request Body Handling

When an operation has a `requestBody` with `application/json` content, Smithers adds a `body` parameter to the generated Zod schema. The agent fills `body` as a plain object; the execute function serializes it with `JSON.stringify` and sends it with `Content-Type: application/json`.

Required request bodies become required `body` parameters; optional request bodies become optional.

```ts
// Agent input for a POST /pets operation
{
  body: { name: "Fido", species: "dog" }
}
// → POST /pets with Content-Type: application/json and body {"name":"Fido","species":"dog"}
```

Parameters with `in: cookie` are silently skipped -- cookies are not exposed to agents.
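The body serialization described in this section can be sketched as follows. `buildRequestInit` and `RequestInitSketch` are hypothetical names for illustration; the real execute function also handles parameters and authentication.

```ts
// Hypothetical sketch: turning the agent's `body` input into request options.
type RequestInitSketch = {
  method: string;
  headers: Record<string, string>;
  body?: string;
};

function buildRequestInit(
  method: string,
  input: { body?: unknown },
  extraHeaders: Record<string, string> = {},
): RequestInitSketch {
  const headers: Record<string, string> = { ...extraHeaders };
  const init: RequestInitSketch = { method: method.toUpperCase(), headers };
  if (input.body !== undefined) {
    // Only set the JSON content type when there is a body to send.
    headers["Content-Type"] = "application/json";
    init.body = JSON.stringify(input.body);
  }
  return init;
}

const createPetInit = buildRequestInit("post", { body: { name: "Fido", species: "dog" } });
// createPetInit.body === '{"name":"Fido","species":"dog"}'
```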

## Non-JSON Response Handling

If the API returns a response with a non-JSON content type (anything that does not include `application/json`), the execute function returns the raw response text as a string. The agent receives that string as the tool result and can parse or summarize it as needed.

```ts
// JSON response → parsed JavaScript object returned to agent
// text/plain, text/html, etc. → raw string returned to agent
```
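A minimal sketch of that branching, assuming the decision is made on the `Content-Type` header alone. `parseResponseBody` is a hypothetical name, and the case-insensitive comparison is an assumption.

```ts
// Hypothetical sketch: JSON responses are parsed, everything else stays raw text.
function parseResponseBody(contentType: string | null, text: string): unknown {
  const isJson =
    contentType !== null && contentType.toLowerCase().includes("application/json");
  return isJson ? JSON.parse(text) : text;
}

const parsedJson = parseResponseBody("application/json; charset=utf-8", '{"ok":true}');
const rawText = parseResponseBody("text/plain", "service healthy");
// parsedJson is an object; rawText is the string "service healthy"
```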

## Error Result Wrapping

When an HTTP call fails (network error, timeout, unexpected exception), the tool does not throw. Instead it returns a structured error object:

```ts
{
  error: true,
  message: "fetch failed: connection refused",
  status: "failed",
}
```

The agent sees this object as the tool result and can decide whether to retry, report the error, or continue with other tools. HTTP 4xx and 5xx responses are not automatically treated as errors -- the agent receives the parsed response body and can inspect the status itself.
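The wrapping pattern itself is small. The sketch below is synchronous for brevity and uses hypothetical names (`toErrorResult`, `callSafely`); the real execute function is asynchronous, but the same idea applies around an awaited `fetch`.

```ts
// Hypothetical sketch: convert thrown errors into a structured tool result.
type ToolErrorResult = { error: true; message: string; status: "failed" };

function toErrorResult(cause: unknown): ToolErrorResult {
  const message = cause instanceof Error ? cause.message : String(cause);
  return { error: true, message, status: "failed" };
}

function callSafely<T>(run: () => T): T | ToolErrorResult {
  try {
    return run();
  } catch (cause) {
    return toErrorResult(cause); // never rethrow: the agent sees the failure as data
  }
}

const failedCall = callSafely(() => {
  throw new Error("fetch failed: connection refused");
});
const successfulCall = callSafely(() => 42);
```

Returning the failure as data keeps the decision (retry, report, or move on) with the agent rather than aborting the task.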

## Schema Composition: allOf, anyOf, oneOf

Smithers converts OpenAPI composition keywords to Zod:

| Keyword | Zod equivalent |
|---------|---------------|
| `allOf` with one entry | the single entry schema |
| `allOf` with multiple entries | `z.intersection(schemaA, schemaB)` chained |
| `oneOf` | `z.union([...variants])` |
| `anyOf` | `z.union([...variants])` |

Circular `$ref` references are detected and replaced with `z.any()` annotated with the circular reference path.
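One common way to detect circular references is to track the chain of `$ref` names currently being resolved. The sketch below illustrates that technique on a simplified schema shape; the types, the `resolveSchema` name, and the `" -> "` path format are all assumptions, not the Smithers implementation.

```ts
// Hypothetical sketch: detect circular $ref chains with a resolution stack.
type SchemaSketch = {
  $ref?: string;
  type?: string;
  properties?: Record<string, SchemaSketch>;
};
type ResolvedSketch = {
  type: string;
  properties?: Record<string, ResolvedSketch>;
  circular?: string; // set when resolution was cut short by a cycle
};

function resolveSchema(
  schema: SchemaSketch,
  definitions: Record<string, SchemaSketch>,
  stack: string[] = [],
): ResolvedSketch {
  if (schema.$ref !== undefined) {
    const refName = schema.$ref.replace("#/components/schemas/", "");
    if (stack.includes(refName)) {
      // Already resolving this schema further up: stop and record the path.
      return { type: "any", circular: [...stack, refName].join(" -> ") };
    }
    const target = definitions[refName];
    if (target === undefined) {
      throw new Error("Unresolved reference: " + schema.$ref);
    }
    return resolveSchema(target, definitions, [...stack, refName]);
  }

  const resolved: ResolvedSketch = { type: schema.type ?? "any" };
  const sourceProps = schema.properties;
  if (sourceProps !== undefined) {
    const resolvedProps: Record<string, ResolvedSketch> = {};
    for (const key of Object.keys(sourceProps)) {
      resolvedProps[key] = resolveSchema(sourceProps[key], definitions, stack);
    }
    resolved.properties = resolvedProps;
  }
  return resolved;
}

const schemaDefinitions: Record<string, SchemaSketch> = {
  TreeNode: {
    type: "object",
    properties: {
      label: { type: "string" },
      next: { $ref: "#/components/schemas/TreeNode" },
    },
  },
};
const resolvedTree = resolveSchema({ $ref: "#/components/schemas/TreeNode" }, schemaDefinitions);
// resolvedTree.properties.next is { type: "any", circular: "TreeNode -> TreeNode" }
```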

## Nullable and Default Values

Two OpenAPI schema properties affect the generated Zod schema:

- **`nullable: true`** -- wraps the schema with `.nullable()` so the agent can provide `null`
- **`default: <value>`** -- adds `.default(<value>)` so missing inputs fall back to the spec default

These are applied after the base type, before the description:

```ts
// OpenAPI schema: { type: "string", nullable: true, default: "unknown" }
// Generated Zod:  z.string().default("unknown").nullable()
```
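To make the ordering concrete, here is a sketch that emits the Zod expression as a string for a few primitive types. It is an illustration only: `toZodExpression` and `PrimitiveSchemaSketch` are hypothetical, and the real converter builds live Zod schemas rather than source text.

```ts
// Hypothetical sketch: base type first, then default, then nullable, then description.
type PrimitiveSchemaSketch = {
  type?: "string" | "integer" | "number" | "boolean";
  nullable?: boolean;
  default?: unknown;
  description?: string;
};

function toZodExpression(schema: PrimitiveSchemaSketch): string {
  const baseTypes: Record<string, string> = {
    string: "z.string()",
    integer: "z.number().int()",
    number: "z.number()",
    boolean: "z.boolean()",
  };
  let expression = schema.type ? baseTypes[schema.type] : "z.any()";
  if (schema.default !== undefined) {
    expression += `.default(${JSON.stringify(schema.default)})`;
  }
  if (schema.nullable === true) {
    expression += ".nullable()";
  }
  if (schema.description) {
    expression += `.describe(${JSON.stringify(schema.description)})`;
  }
  return expression;
}

const nullableWithDefault = toZodExpression({ type: "string", nullable: true, default: "unknown" });
// nullableWithDefault === 'z.string().default("unknown").nullable()'
```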

## When to Use OpenAPI Tools

Use them when you have an existing REST API with an OpenAPI spec and you want agents to interact with it. They are particularly good for:

- Internal APIs with dozens of endpoints
- Third-party APIs that publish OpenAPI specs
- Rapid prototyping where hand-writing tools is too slow

Do not use them when you need fine-grained control over how an API is called -- custom retry logic, request transformation, response filtering. In those cases, write a custom tool and call the API yourself.

---

## Planner Internals

> How Smithers compiles a builder graph into an executable durable plan.
> Source: https://smithers.sh/concepts/planner-internals

> **Warning:** This page describes internal architecture. You do not need it to write workflows, but it is useful when designing new primitives or debugging scheduler behavior.


This page explains *why* workflows behave the way they do -- why a step waits for its dependencies, why a crash doesn't lose progress, why the scheduler always knows what to run next.

Think of Smithers as a small compiler. Your JSX workflow definition is the source language. The output is not machine code but a durable execution plan -- a data structure the scheduler can walk, suspend, resume, and recover without ever re-reading your original definition.

The pipeline has six stages:

```txt
JSX Workflow
  ->
Workflow Graph
  ->
Normalized Plan
  ->
Durable Execution State
  ->
Ready Step Set
  ->
Effect Execution + Commit
```

We will walk through each one. By the end, you will know exactly what happens between the moment you write `<Workflow>` and the moment a step runs.

## Stage 1: Builder Capture

Start with the simplest interesting example:

```tsx
<Workflow name="bugfix">
  <Sequence>
    <Task id="analyze" output={outputs.analysis}>
      {{ summary: String(ctx.input.title) }}
    </Task>
    <Approval
      id="approve"
      output={outputs.approval}
      request={{ title: "Approve fix?" }}
      onDeny="fail"
    />
    <Task id="fix" output={outputs.fix}>
      {{ patch: "fix applied" }}
    </Task>
  </Sequence>
</Workflow>
```

What does Smithers do with this? It does *not* execute it. It captures a typed graph -- nodes with explicit kinds, stable ids, and typed handles. No side effects, no network calls, just structure.

Conceptually, the graph looks like this:

```ts
type WorkflowNode =
  | { kind: "step"; id: string; ... }
  | { kind: "sequence"; children: WorkflowNode[] }
  | { kind: "parallel"; children: WorkflowNode[]; maxConcurrency?: number }
  | { kind: "approval"; id: string; ... }
  | { kind: "match"; id: string; ... }
  | { kind: "loop"; id: string; ... };
```

Why does this matter? Because the graph is fully explicit *before execution begins*. The scheduler never has to interpret your JSX at runtime. It works from data.

If you have written a compiler, this is the parse tree. If you have not written a compiler, think of it as a blueprint: the house is not built yet, but every wall, door, and wire is accounted for.

## Stage 2: Normalization

A parse tree is not enough. You need to lower it into something the scheduler can work with mechanically. That is what normalization does.

Specifically, normalization:

- resolves step handles to stable internal ids
- flattens `needs` into dependency edges
- assigns concurrency groups
- attaches output model signatures
- attaches retry, timeout, cache, and approval policies
- derives branch and loop controller descriptors

After this pass, every executable node is schedulable without touching your builder code again. The original JSX has served its purpose. From here on, the scheduler operates on plain data.

If the analogy helps: this is the intermediate representation. The frontend is done.
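The core of the lowering, turning implicit `<Sequence>` ordering into explicit `needs` edges, can be sketched on a reduced node type. This is a simplified illustration under stated assumptions: only steps, sequences, and parallels are modeled, and `GraphNode`, `PlannedStep`, and `lower` are hypothetical names.

```ts
// Hypothetical sketch: lower sequence/parallel structure into dependency edges.
type GraphNode =
  | { kind: "step"; id: string; needs?: string[] }
  | { kind: "sequence"; children: GraphNode[] }
  | { kind: "parallel"; children: GraphNode[] };

type PlannedStep = { id: string; needs: string[] };

// Returns the step ids that whatever comes next must wait for.
function lower(node: GraphNode, upstream: string[], plan: PlannedStep[]): string[] {
  switch (node.kind) {
    case "step": {
      const needs = Array.from(new Set([...upstream, ...(node.needs ?? [])]));
      plan.push({ id: node.id, needs });
      return [node.id];
    }
    case "sequence": {
      // Each child waits for the previous child's exits.
      let current = upstream;
      for (const child of node.children) {
        current = lower(child, current, plan);
      }
      return current;
    }
    case "parallel": {
      // All children share the same upstream; the next node waits for all of them.
      const exits: string[] = [];
      for (const child of node.children) {
        exits.push(...lower(child, upstream, plan));
      }
      return exits.length > 0 ? exits : upstream;
    }
  }
}

const loweredPlan: PlannedStep[] = [];
lower(
  {
    kind: "sequence",
    children: [
      { kind: "step", id: "analyze" },
      {
        kind: "parallel",
        children: [
          { kind: "step", id: "lint" },
          { kind: "step", id: "test" },
        ],
      },
      { kind: "step", id: "report" },
    ],
  },
  [],
  loweredPlan,
);
// "lint" and "test" each need "analyze"; "report" needs both "lint" and "test".
```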

## Compiled Step Descriptor

So what does a normalized step actually look like? Here is the shape:

```ts
type CompiledStep = {
  id: string;
  outputModel: string;
  needs: ReadonlyArray<{ name: string; stepId: string }>;
  retryPolicy?: unknown;
  timeout?: unknown;
  cachePolicy?: unknown;
  concurrencyGroup?: string;
  run: unknown;
};
```

Notice that `run` is a callback, but everything else is plain data. That distinction is not an accident. The planner and persistence layer must reason about the workflow even when a step is not currently running -- they need to inspect dependencies, check policies, and compute the ready set. They cannot do that if the interesting information is locked inside closures.

Data you can inspect. Callbacks you can only call.

## Stage 3: Durable Execution State

Here is where things get interesting. You have a normalized plan -- a static description of what *could* happen. Now Smithers pairs it with persisted execution state -- a record of what *has* happened.

That state includes:

- execution record
- node state
- completed outputs
- attempt history
- approval state
- loop state
- branch decisions
- cache hits or invalidations

Why keep all of this? Because the plan tells the scheduler what is possible. The state tells it what has already occurred. Together, they answer the only question the scheduler cares about: *what should run next?*

This is also what makes execution durable. If the process crashes after step two of five, the persisted state remembers those two completions. When the scheduler restarts, it rebuilds the ready set from the plan and the state, and picks up exactly where it left off. No guessing.

## Stage 4: Ready-Set Computation

Now the scheduler walks the normalized plan and computes the ready set -- the steps that can run right now.

A step is ready when *all* of the following hold:

- every entry in `needs` is completed
- any enclosing sequence has advanced to it
- any approval or branch controller has resolved in its favor
- any loop state allows the current iteration
- concurrency limits allow admission

That is the entire scheduling algorithm. No priority queues, no heuristics, no special cases. A step is either ready or it is not, and the answer comes from the graph edges and the current state.

Why is this simpler than you might expect? Because the hard work happened in normalization. All the implicit ordering from `<Sequence>`, all the conditional logic from `<Match>`, all the iteration bookkeeping from loops -- it was all lowered into explicit dependency edges and controller descriptors. The scheduler just reads them.
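A reduced version of the readiness check, covering only `needs` and concurrency limits, might look like this. Approvals, branches, and loops are omitted, and the names (`computeReadySet`, `StepStatus`) are hypothetical.

```ts
// Hypothetical sketch: ready set from dependency edges plus persisted state.
type PlanStep = { id: string; needs: string[] };
type StepStatus = "pending" | "running" | "completed";

function computeReadySet(
  steps: PlanStep[],
  state: Map<string, StepStatus>,
  maxConcurrency: number,
): string[] {
  const statusOf = (id: string): StepStatus => state.get(id) ?? "pending";
  const running = steps.filter((step) => statusOf(step.id) === "running").length;
  const freeSlots = Math.max(0, maxConcurrency - running);
  return steps
    .filter((step) => statusOf(step.id) === "pending")
    .filter((step) => step.needs.every((dep) => statusOf(dep) === "completed"))
    .slice(0, freeSlots) // concurrency limits gate admission
    .map((step) => step.id);
}

const examplePlan: PlanStep[] = [
  { id: "a", needs: [] },
  { id: "b", needs: ["a"] },
  { id: "c", needs: ["a"] },
  { id: "d", needs: ["b", "c"] },
];

// State as it might look after a restart: "a" finished before the crash.
const recoveredState = new Map<string, StepStatus>([["a", "completed"]]);
const readyAfterRestart = computeReadySet(examplePlan, recoveredState, 2);
// readyAfterRestart is ["b", "c"]
```

The function takes only the plan and the state, which is why recovery after a crash needs nothing more than those two inputs.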

## Stage 5: Execution

Ready steps are executed by invoking their `run` callback with:

- validated input
- resolved dependency outputs
- execution metadata
- cancellation signal

That is all the planner provides. It does not know about LLM providers, HTTP clients, or database drivers. It knows how to supply inputs and interpret one of five outcomes: success, failure, retry, suspension, or cancellation.

This boundary is deliberate. The planner is a scheduler, not an application framework. Keeping it ignorant of provider-specific behavior means the same execution engine works regardless of what a step actually does.
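The five outcomes map naturally onto a discriminated union. In the sketch below, the outcome kinds come from the text above, while the payload fields, the `NextState` names, and the retry-budget rule are assumptions made for illustration.

```ts
// Hypothetical sketch: interpreting a step outcome as a state transition.
type StepOutcome =
  | { kind: "success"; output: unknown }
  | { kind: "failure"; error: string }
  | { kind: "retry"; error: string }
  | { kind: "suspension"; reason: string }
  | { kind: "cancellation" };

type NextState = "completed" | "failed" | "pending" | "suspended" | "cancelled";

function nextState(outcome: StepOutcome, attempt: number, maxAttempts: number): NextState {
  switch (outcome.kind) {
    case "success":
      return "completed";
    case "failure":
      return "failed";
    case "retry":
      // Assumption: a retry is only honored while attempts remain.
      return attempt < maxAttempts ? "pending" : "failed";
    case "suspension":
      return "suspended";
    case "cancellation":
      return "cancelled";
  }
}

const afterFirstRetry = nextState({ kind: "retry", error: "rate limited" }, 1, 3);
// afterFirstRetry === "pending"
```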

## Stage 6: Commit and Transition

After a step produces a result, Smithers commits:

- the output model row
- attempt metadata
- lifecycle event records
- state transitions for downstream scheduling

This commit is atomic at the workflow level wherever possible. The goal is simple: after a crash, Smithers recovers from persisted state without guessing. Either a step's result was committed or it was not. There is no in-between to reconcile.

Once the commit lands, the scheduler recomputes the ready set, and the cycle continues until no steps remain.
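The execute, commit, recompute cycle can be sketched as a loop over an in-memory map standing in for the persisted state. This is a toy model with hypothetical names (`runToCompletion`, `RunnableStep`); the real engine commits to SQLite and handles far more than `needs`.

```ts
// Hypothetical sketch: execute ready steps, commit results, recompute, repeat.
type RunnableStep = { id: string; needs: string[]; run: () => string };

function runToCompletion(steps: RunnableStep[], committed: Map<string, string>): string[] {
  const executed: string[] = [];
  for (;;) {
    // Recompute the ready set from the plan plus committed state.
    const ready = steps.filter(
      (step) => !committed.has(step.id) && step.needs.every((dep) => committed.has(dep)),
    );
    if (ready.length === 0) break; // nothing left that can run
    for (const step of ready) {
      const output = step.run();      // execute
      committed.set(step.id, output); // commit before anything downstream is scheduled
      executed.push(step.id);
    }
  }
  return executed;
}

const bugfixPlan: RunnableStep[] = [
  { id: "analyze", needs: [], run: () => "analysis" },
  { id: "fix", needs: ["analyze"], run: () => "patch" },
  { id: "report", needs: ["fix"], run: () => "summary" },
];

const freshRun = runToCompletion(bugfixPlan, new Map<string, string>());
// freshRun is ["analyze", "fix", "report"]

// Resuming with "analyze" already committed skips it entirely.
const resumedRun = runToCompletion(bugfixPlan, new Map([["analyze", "analysis"]]));
// resumedRun is ["fix", "report"]
```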

## Debugging

When something goes wrong -- and something always goes wrong -- the most useful artifacts are:

- normalized plan snapshots
- node state tables
- dependency edges
- ready-set explanations
- step transition events

If those five things are visible, most workflow bugs reduce to: "this step expected that dependency to be complete, and it was not." The fix usually becomes obvious once you can see the edges.

## Next Steps

- [Runtime Events](/runtime/events) -- Planner-level transitions and lifecycle events.
- [runWorkflow](/runtime/run-workflow) -- Execution entry points for workflows.

---

## Best Practices

> Guidelines for designing effective Smithers workflows that produce reliable, high-quality results.
> Source: https://smithers.sh/guides/best-practices

Here is the uncomfortable truth about agent workflows: the orchestration code is easy. Getting agents to do what you actually want -- consistently, correctly, every time -- is the hard part.

These practices address that hard part.

## Give Agents Big, Coherent Tasks

Every [task](/components/task) boundary is a context boundary. When one task ends and another begins, the agent forgets everything from the previous task. It starts fresh with only the prompt you give it and the [structured output](/guides/structured-output) you pass in.

So do not split one logical operation into four tiny tasks. You are not decomposing work -- you are destroying context.

```tsx
// assuming outputs from createSmithers

// Avoid: splitting one logical operation into many tiny tasks
<Sequence>
  <Task id="read-files" output={outputs.files} agent={codeAgent}>Read the config files</Task>
  <Task id="find-bugs" output={outputs.bugs} agent={codeAgent}>Find bugs in the files</Task>
  <Task id="fix-bugs" output={outputs.fixes} agent={codeAgent}>Fix the bugs you found</Task>
  <Task id="write-fixes" output={outputs.written} agent={codeAgent}>Write the fixes to disk</Task>
</Sequence>

// Better: one coherent task with tools
<Task id="fix-config-bugs" output={outputs.result} agent={codeAgentWithTools}>
  {`Analyze the config files in ${ctx.input.configDir}, find any bugs,
fix them, and write the corrected files. Return a summary of changes.`}
</Task>
```

The second version gives the agent the full picture and lets it use [tools](/integrations/tools) (`read`, `edit`, `bash`) to accomplish everything in one pass. Only split into multiple tasks when the context genuinely changes, you need an explicit checkpoint, or a later task depends on the structured output of an earlier one.

## Use Measurable Stop Conditions for Loops

A [Loop](/components/loop) should stop based on a concrete, measurable signal -- not a subjective judgment. Ask yourself: can a machine evaluate this condition without interpretation?

Good stop conditions:

- Tests passing (boolean)
- Approval flag from a reviewer (boolean)
- Score exceeding a threshold (number comparison)
- All items in a list processed (array length check)

```tsx
// Good: concrete stop condition
// assuming outputs from createSmithers
<Loop
  until={ctx.outputMaybe(outputs.review, { nodeId: "review" })?.approved === true}
  maxIterations={5}
  onMaxReached="return-last"
>
  <Sequence>
    <Task id="implement" output={outputs.implement} agent={implementer}>
      {`Implement the feature. Previous feedback: ${
        ctx.outputMaybe(outputs.review, { nodeId: "review" })?.feedback ?? "None yet"
      }`}
    </Task>
    <Task id="review" output={outputs.review} agent={reviewer}>
      {`Review the implementation. Approve if it meets requirements.
Return JSON with approved (boolean) and feedback (string).`}
    </Task>
  </Sequence>
</Loop>
```

Always set `maxIterations`. A loop without a cap is a bug waiting to burn your API budget at 3 AM.
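Stripped of JSX, the loop semantics amount to a boolean check plus a hard cap. The sketch below models that outside Smithers; `runUntilApproved` and its result shape are hypothetical, and returning the last result when the cap is hit mirrors the `onMaxReached="return-last"` setting shown above.

```ts
// Hypothetical sketch: a loop with a measurable stop condition and a cap.
type ReviewResult = { approved: boolean; feedback: string };
type LoopResult = {
  iterations: number;
  last: ReviewResult | undefined;
  stoppedBy: "approved" | "max-iterations";
};

function runUntilApproved(
  iterate: (iteration: number, previous: ReviewResult | undefined) => ReviewResult,
  maxIterations: number,
): LoopResult {
  let last: ReviewResult | undefined = undefined;
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    last = iterate(iteration, last);
    if (last.approved === true) {
      return { iterations: iteration, last, stoppedBy: "approved" };
    }
  }
  return { iterations: maxIterations, last, stoppedBy: "max-iterations" };
}

// A stand-in reviewer that approves on the third pass.
const approvedRun = runUntilApproved(
  (iteration) => ({ approved: iteration >= 3, feedback: `pass ${iteration}` }),
  5,
);
// approvedRun.iterations === 3, approvedRun.stoppedBy === "approved"
```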

## Ask for Validation in Prompts

Do not assume the agent will run checks without being told. If you need tests, linting, or verification, say so explicitly in the prompt. Agents are literal-minded collaborators -- they do what you ask, not what you hope.

```tsx
// assuming outputs from createSmithers

// Vague -- agent might not verify
<Task id="implement" output={outputs.result} agent={codeAgent}>
  Fix the authentication bug.
</Task>

// Better -- explicit verification instructions
<Task id="implement" output={outputs.result} agent={codeAgentWithTools}>
  {`Fix the authentication bug in ${ctx.input.file}.

After making changes:
1. Run the test suite with \`bun test\`
2. Verify the specific failing test passes
3. Check that no other tests regressed

Return JSON with:
- summary (string): what you changed
- testsRun (number): how many tests ran
- testsPassed (number): how many passed
- filesChanged (string[]): list of modified files`}
</Task>
```

The second prompt leaves nothing to chance. The agent knows *what* to verify, *how* to verify it, and *what to report back*.

## Request Structured Reports

Design your [structured output](/guides/structured-output) schemas to capture the data you need for downstream tasks and human inspection. Every field should be something you can act on programmatically:

```tsx
const { Workflow, smithers, outputs } = createSmithers({
  analysis: z.object({
    summary: z.string(),
    issuesFound: z.number(),
    criticalIssues: z.number(),
    filesAnalyzed: z.array(z.string()),
    recommendations: z.array(z.object({
      file: z.string(),
      line: z.number(),
      severity: z.enum(["low", "medium", "high", "critical"]),
      description: z.string(),
      suggestedFix: z.string(),
    })),
  }),
});
```

With this schema you can conditionally branch based on `criticalIssues > 0`, generate summary reports from structured data, track metrics across runs, and feed specific recommendations into a fix task. Free-text summaries give you none of that.

## Use outputSchema for Type Safety

The `outputSchema` prop validates the agent's response against your [Zod](https://zod.dev) schema. It does three things:

1. **Validation** -- Responses are validated against the schema, with auto-retry on failure.
2. **Auto-injection** -- When children are JSX/[MDX](/guides/mdx-prompts) elements, `props.schema` is auto-injected with a JSON example.
3. **Cache key** -- The schema shape is part of the cache key, so schema changes invalidate stale caches.

```tsx
const reviewSchema = z.object({
  approved: z.boolean(),
  feedback: z.string().min(10),
  score: z.number().int().min(1).max(10),
});

// assuming outputs from createSmithers
<Task id="review" output={outputs.review} outputSchema={reviewSchema} agent={reviewer} deps={{ implement: outputs.implement }}>
  {(deps) => <ReviewPrompt code={deps.implement.code} />}
</Task>
```

If the agent returns `{ approved: "yes" }` instead of `{ approved: true }`, schema validation catches it and retries -- without burning a full task retry.

## Mark Side-Effect Tools

This is the single most important thing to get right with custom [tools](/integrations/tools). If your tool mutates external state -- calling an API, writing to a database, sending an email, creating a PR -- you **must** set `sideEffect: true` on the tool definition. If the mutation is not safe to repeat, also set `idempotent: false`.

Why this matters: Smithers retries failed tasks. Without `sideEffect: true`, Smithers treats your tool as a pure read and replays it without warning. That means duplicate orders, duplicate emails, double charges. The `sideEffect` flag is how Smithers knows to warn the agent on retry and provide an idempotency key for deduplication.

```tsx
import { defineTool } from "smithers-orchestrator";
import { z } from "zod";

// ✗ Dangerous: external mutation without sideEffect flag
const createTicket = defineTool({
  name: "jira.create_ticket",
  schema: z.object({ title: z.string(), body: z.string() }),
  async execute(args) {
    return await jira.createIssue(args);
  },
});

// ✓ Correct: marked as non-idempotent side effect
const createTicket = defineTool({
  name: "jira.create_ticket",
  schema: z.object({ title: z.string(), body: z.string() }),
  sideEffect: true,
  idempotent: false,
  async execute(args, ctx) {
    return await jira.createIssue({
      ...args,
      idempotencyKey: ctx.idempotencyKey,
    });
  },
});
```
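An idempotency key only helps if the receiving side deduplicates on it. The sketch below shows that half of the contract with an in-memory stand-in; `FakeTicketService` is hypothetical and does not represent Jira's API or any Smithers internals.

```ts
// Hypothetical sketch: a service that deduplicates creations by idempotency key.
class FakeTicketService {
  private ticketsByKey = new Map<string, string>();
  private nextNumber = 1;

  create(title: string, idempotencyKey: string): string {
    const existing = this.ticketsByKey.get(idempotencyKey);
    if (existing !== undefined) {
      return existing; // replayed call: return the original ticket, create nothing
    }
    const ticketId = `TICKET-${this.nextNumber}`;
    this.nextNumber += 1;
    this.ticketsByKey.set(idempotencyKey, ticketId);
    return ticketId;
  }

  count(): number {
    return this.ticketsByKey.size;
  }
}

const ticketService = new FakeTicketService();
const firstAttempt = ticketService.create("Login fails", "run-1:fix:create_ticket");
const retriedAttempt = ticketService.create("Login fails", "run-1:fix:create_ticket");
// firstAttempt === retriedAttempt, and only one ticket exists
```

If the external API has no such deduplication, the key cannot prevent duplicates on its own, which is why `idempotent: false` matters.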

Not everything is a side effect. File system changes inside the sandbox -- writing files, editing code, running local shell commands -- are **not** side effects. They are sandboxed, local, and tracked by git. You can undo them with `git reset`. The built-in `write`, `edit`, and `bash` tools do not carry the side effect flag for exactly this reason.

The rule: **if you cannot undo it with `git reset`, it is a side effect. Mark it.**

For the full reference on `sideEffect`, `idempotent`, and `ctx.idempotencyKey`, see [defineTool](/integrations/tools#side-effects-and-idempotency).

## Design for Resumability

Every long-running workflow will eventually crash. A network blip, a rate limit, a deploy that kills the process. If your workflow cannot [resume](/guides/resumability) from where it stopped, you are starting over from scratch every time.

Three rules:

- **Use deterministic task IDs.** No timestamps, no random strings, no array indices. If the ID changes between renders, Smithers treats it as a different task.
- **Make tasks idempotent where possible.** If a task writes files, design it so re-running produces the same result. For custom tools that call external APIs, [mark them as side effects](#mark-side-effect-tools) so Smithers handles retries safely.
- **Use `deps` for direct task handoff and `ctx.outputMaybe()` for orchestration decisions.** This keeps prompt wiring terse while preserving explicit [control-flow logic](/concepts/control-flow).

```tsx
// Good: deterministic IDs, typed deps, resumable
// assuming outputs from createSmithers
export default smithers((ctx) => (
  <Workflow name="robust">
    <Sequence>
      <Task id="analyze" output={outputs.analysis} agent={analyst}>
        Analyze the codebase.
      </Task>
      <Task id="fix" output={outputs.fix} agent={fixer} deps={{ analyze: outputs.analysis }}>
        {(deps) => `Fix: ${deps.analyze.summary}`}
      </Task>
      <Task id="report" output={outputs.report} deps={{ fix: outputs.fix }}>
        {(deps) => ({ summary: deps.fix.explanation, filesChanged: deps.fix.files })}
      </Task>
    </Sequence>
  </Workflow>
));
```
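When tasks are generated from a list, the same rule applies: derive the ID from something stable in the data, not from position. A small sketch, using hypothetical helper names and a hypothetical `fix-` prefix:

```ts
// Hypothetical sketch: stable IDs from data keys versus fragile IDs from indices.
type TicketInput = { key: string; title: string };

function stableTaskId(ticket: TicketInput): string {
  return `fix-${ticket.key.toLowerCase()}`;
}

function indexTaskId(tickets: TicketInput[], ticket: TicketInput): string {
  return `fix-${tickets.indexOf(ticket)}`;
}

const loginBug: TicketInput = { key: "ABC-1", title: "Login fails" };
const cacheBug: TicketInput = { key: "ABC-2", title: "Cache is stale" };

const firstRender = [loginBug, cacheBug];
const secondRender = [cacheBug, loginBug]; // same tickets, different order

// Stable: "fix-abc-1" in both renders.
const stableIds = [stableTaskId(firstRender[0]), stableTaskId(secondRender[1])];
// Fragile: "fix-0" in the first render, "fix-1" in the second.
const fragileIds = [indexTaskId(firstRender, loginBug), indexTaskId(secondRender, loginBug)];
```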

## Keep Prompts and Schemas Separate from Logic

As your workflow grows, you will want to iterate on prompts without touching orchestration logic, and swap agents without changing schemas. Separate your concerns:

- **`schemas.ts`** -- All Zod schemas in one file.
- **`agents.ts`** -- Agent configuration (model, system prompt, tools).
- **`prompts/`** -- [MDX prompt templates](/guides/mdx-prompts).
- **`workflow.tsx`** -- Composition only (how tasks connect, branch, and hand typed deps into steps).

When a prompt change requires editing `workflow.tsx`, something is wrong with your factoring.

## Set Reasonable Timeouts and Retry Limits

Every agent task should have a timeout. Agent calls can hang due to rate limits, network issues, or unexpectedly long generation. A task without a timeout is a task that might run forever.

```tsx
{/* assuming outputs from createSmithers */}
<Task
  id="analyze"
  output={outputs.analysis}
  agent={analyst}
  timeoutMs={120_000}   // 2 minutes
  retries={2}            // 3 total attempts
>
  Analyze the codebase.
</Task>
```

Rules of thumb:

- **Simple analysis tasks**: 30-60 seconds timeout, 1-2 retries.
- **Tool-using tasks** (read, edit, bash): 2-5 minutes timeout, 1-2 retries.
- **Large generation tasks**: 5-10 minutes timeout, 0-1 retries.
- **Non-critical tasks**: add `continueOnFail` so failures do not block the workflow.
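The relationship between `retries` and total attempts (`retries={2}` means up to three attempts) can be sketched as a plain loop. `runWithRetries` is a hypothetical, synchronous stand-in; it ignores timeouts and backoff.

```ts
// Hypothetical sketch: `retries` counts extra attempts after the first one.
function runWithRetries<T>(
  attemptFn: (attempt: number) => T,
  retries: number,
): { value: T; attempts: number } {
  const totalAttempts = retries + 1;
  let lastError: unknown = undefined;
  for (let attempt = 1; attempt <= totalAttempts; attempt++) {
    try {
      return { value: attemptFn(attempt), attempts: attempt };
    } catch (error) {
      lastError = error; // remember the failure and try again if budget remains
    }
  }
  throw lastError;
}

// A stand-in task that fails twice before succeeding.
const flakyTask = (attempt: number): string => {
  if (attempt < 3) throw new Error(`attempt ${attempt} failed`);
  return "ok";
};

const retryOutcome = runWithRetries(flakyTask, 2);
// retryOutcome is { value: "ok", attempts: 3 }
```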

## Use Caching for Iterative Development

You are going to iterate on prompts. A lot. Each iteration should not re-run every upstream task that already succeeded.

```tsx
<Workflow name="my-workflow" cache>
```

The cache key includes the prompt, model, tools, schema, and JJ pointer. Changing any of these invalidates the cache for that specific task. This means you can safely tweak a downstream prompt without re-running the expensive analysis step that feeds it.
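To see why changing one input invalidates only that task's entry, picture the key as a canonical serialization of the listed inputs. The sketch below is an illustration, not the real key format: `cacheKeySketch` is hypothetical, `jjPointer` is treated as an opaque string, sorting tool names is an assumption, and a real implementation would likely hash the result.

```ts
// Hypothetical sketch: a cache key built from the inputs listed above.
type CacheKeyInputs = {
  prompt: string;
  model: string;
  tools: string[];
  schema: unknown;   // JSON-serializable description of the output schema
  jjPointer: string; // opaque pointer string (assumption)
};

function cacheKeySketch(inputs: CacheKeyInputs): string {
  return JSON.stringify([
    inputs.prompt,
    inputs.model,
    [...inputs.tools].sort(), // tool order should not change the key (assumption)
    inputs.schema,
    inputs.jjPointer,
  ]);
}

const baseInputs: CacheKeyInputs = {
  prompt: "Analyze the codebase.",
  model: "claude-sonnet-4-20250514",
  tools: ["read", "grep"],
  schema: { summary: "string" },
  jjPointer: "abc123",
};

const originalKey = cacheKeySketch(baseInputs);
const reorderedToolsKey = cacheKeySketch({ ...baseInputs, tools: ["grep", "read"] });
const editedPromptKey = cacheKeySketch({ ...baseInputs, prompt: "Analyze the codebase carefully." });
// originalKey === reorderedToolsKey, but editedPromptKey differs
```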

Disable caching in production if you need fresh results on every run.

## Example: Complete Review Loop

Here is a full example combining these practices -- a [review loop](/guides/review-loop) with structured output, measurable stop conditions, explicit validation instructions, and reasonable error handling:

```tsx
import { createSmithers, Task, Sequence, Loop } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { read, grep, bash, edit, write } from "smithers-orchestrator";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  implement: z.object({
    summary: z.string(),
    filesChanged: z.array(z.string()),
    testsRun: z.number(),
    testsPassed: z.number(),
  }),
  review: z.object({
    approved: z.boolean(),
    feedback: z.string(),
    score: z.number().int().min(1).max(10),
  }),
  report: z.object({
    title: z.string(),
    body: z.string(),
    iterations: z.number(),
    finalScore: z.number(),
  }),
});

const implementer = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a senior engineer. Implement changes, run tests, and return structured JSON.",
  tools: { read, grep, bash, edit, write },
});

const reviewer = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a strict code reviewer. Return structured JSON with your assessment.",
  tools: { read, grep },
});

export default smithers((ctx) => {
  const review = ctx.outputMaybe(outputs.review, { nodeId: "review" });

  return (
    <Workflow name="review-loop">
      <Sequence>
        <Loop
          until={review?.approved === true}
          maxIterations={3}
          onMaxReached="return-last"
        >
          <Sequence>
            <Task
              id="implement"
              output={outputs.implement}
              agent={implementer}
              timeoutMs={300_000}
              retries={1}
            >
              {`Implement: ${ctx.input.description}

${review?.feedback ? `Previous review feedback:\n${review.feedback}` : ""}

After making changes:
1. Run \`bun test\` and report results
2. Verify your changes address the requirements

Return JSON with summary, filesChanged, testsRun, testsPassed.`}
            </Task>

            <Task
              id="review"
              output={outputs.review}
              agent={reviewer}
              timeoutMs={120_000}
              retries={1}
              deps={{ implement: outputs.implement }}
            >
              {(deps) => `Review the implementation.
Summary: ${deps.implement.summary}
Files changed: ${deps.implement.filesChanged.join(", ")}
Tests: ${deps.implement.testsPassed}/${deps.implement.testsRun} passed

Approve only if tests pass and the code is clean.
Return JSON with approved (boolean), feedback (string), score (1-10).`}
            </Task>
          </Sequence>
        </Loop>

        {review ? (
          <Task id="report" output={outputs.report}>
            {{
              title: `Review: ${ctx.input.description}`,
              body: review.feedback,
              iterations: ctx.iterationCount("review", "review") ?? 1,
              finalScore: review.score,
            }}
          </Task>
        ) : null}
      </Sequence>
    </Workflow>
  );
});
```

## Next Steps

- [Review Loop](/guides/review-loop) -- Production pattern for implement, validate, and review cycles.
- [Patterns](/guides/patterns) -- Project structure and naming conventions.
- [Structured Output](/guides/structured-output) -- Schema validation details.
- [Resumability](/guides/resumability) -- Deterministic IDs, safe retries, and resume behavior.
- [Error Handling](/guides/error-handling) -- Retries, timeouts, and fallback paths.

---

## Implement-Review Loop

> The recommended pattern for production workflows -- a loop that implements, validates, reviews, and fixes code until approved.
> Source: https://smithers.sh/guides/review-loop

You write code. You test it. You review it. You fix what the review found. Then you do it again.

That loop is the oldest pattern in software engineering. Smithers makes it the oldest pattern in *agent* engineering too.

## Why a loop?

Think about what happens without one. An agent writes code, declares victory, and moves on. Nobody checked if it compiles. Nobody checked if the logic is sound. You are trusting a single pass from a model that hallucinates sometimes.

Now think about what happens *with* one. The agent writes code, a separate agent runs the tests, two more agents review the result, and a fixer addresses every issue. Then the whole thing repeats until both reviewers sign off -- or you hit a safety cap.

That is the implement-review loop. Four steps, one [Loop](/components/loop), zero unsupervised merges.

## The four steps

Each iteration runs these in sequence:

1. **Implement** -- An agent writes code (preferably [Codex](https://platform.openai.com/docs)).
2. **Validate** -- A separate agent runs tests to verify correctness.
3. **Review** -- Two agents review in parallel ([Claude](https://docs.anthropic.com) + Codex).
4. **ReviewFix** -- An agent addresses every review issue.

The loop repeats until both reviewers approve or `maxIterations` is hit.
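
The stopping rule can be pictured as a plain TypeScript loop. This is only a sketch of the semantics described above, not how Smithers implements `<Loop>`; `runBoundedLoop` and `IterationResult` are hypothetical names introduced for illustration.

```ts
// Hypothetical model of what one pass through the four steps produces.
type IterationResult = { approvedByClaude: boolean; approvedByCodex: boolean };

// Runs `iterate` until both reviewers approve or the cap is reached.
// Loosely mirrors "return-last": the most recent result is returned either way.
function runBoundedLoop(
  iterate: (iteration: number) => IterationResult,
  maxIterations: number,
): { iterations: number; approved: boolean; last: IterationResult | undefined } {
  let last: IterationResult | undefined;
  let iterations = 0;
  while (iterations < maxIterations) {
    last = iterate(iterations);
    iterations += 1;
    if (last.approvedByClaude && last.approvedByCodex) {
      return { iterations, approved: true, last };
    }
  }
  return { iterations, approved: false, last };
}

// Example: both reviewers approve on the second pass (iteration index 1).
const outcome = runBoundedLoop(
  (i) => ({ approvedByClaude: i >= 1, approvedByCodex: i >= 1 }),
  3,
);
console.log(outcome.iterations, outcome.approved); // 2 true
```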

## Minimal Example

Before wiring up the loop, you need [structured output](/guides/structured-output) schemas. One per step:

```tsx
import { createSmithers, Task, Sequence, Parallel, Loop } from "smithers-orchestrator";
import { z } from "zod";

const { Workflow, smithers, tables, outputs } = createSmithers({
  implement: z.object({
    summary: z.string(),
    filesChanged: z.array(z.string()),
    allTestsPassing: z.boolean(),
  }),
  validate: z.object({
    allPassed: z.boolean(),
    failingSummary: z.string().nullable(),
  }),
  review: z.object({
    reviewer: z.string(),
    approved: z.boolean(),
    issues: z.array(z.object({
      severity: z.enum(["critical", "major", "minor", "nit"]),
      file: z.string(),
      description: z.string(),
    })),
    feedback: z.string(),
  }),
  reviewFix: z.object({
    fixesMade: z.array(z.object({ issue: z.string(), fix: z.string() })),
    allIssuesResolved: z.boolean(),
  }),
});
```

Four schemas, four steps. Each one captures exactly the data the next step needs. No more, no less.

## The ValidationLoop Component

Here is the core pattern. Read it top to bottom -- it is a [Loop](/components/loop) wrapping a [Sequence](/components/sequence) of four components:

```tsx
// components/ValidationLoop.tsx
import { Loop, Sequence } from "smithers-orchestrator";
import { Implement } from "./Implement";
import { Validate } from "./Validate";
import { Review } from "./Review";
import { ReviewFix } from "./ReviewFix";
import { useCtx, tables } from "../smithers";
import type { Ticket } from "./Discover.schema";
import type { ReviewOutput } from "./Review.schema";

const MAX_REVIEW_ROUNDS = 3;

export function ValidationLoop({ ticket }: { ticket: Ticket }) {
  const ctx = useCtx();
  const ticketId = ticket.id;

  const claudeReview = ctx.latest(tables.review, `${ticketId}:review-claude`) as ReviewOutput | undefined;
  const codexReview = ctx.latest(tables.review, `${ticketId}:review-codex`) as ReviewOutput | undefined;

  const allApproved = !!claudeReview?.approved && !!codexReview?.approved;

  return (
    <Loop
      id={`${ticketId}:impl-review-loop`}
      until={allApproved}
      maxIterations={MAX_REVIEW_ROUNDS}
      onMaxReached="return-last"
    >
      <Sequence>
        <Implement ticket={ticket} />
        <Validate ticket={ticket} />
        <Review ticket={ticket} />
        <ReviewFix ticket={ticket} />
      </Sequence>
    </Loop>
  );
}
```

Notice the stop condition: `until={allApproved}`. That is a boolean derived from two separate review outputs. Not a vague "looks good" -- a concrete, programmatic signal. The loop keeps going until both reviewers say yes, or three rounds pass, whichever comes first.
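
The double negation matters because a review may not exist yet. The sketch below isolates that expression in plain TypeScript, assuming (as the `| undefined` casts in the component suggest) that a missing review shows up as `undefined`. `ReviewApproval` and `bothApproved` are illustrative names, not part of the Smithers API.

```ts
// Only the field the stop condition reads; the real ReviewOutput has more.
type ReviewApproval = { approved: boolean };

// A missing review (undefined) counts as "not approved".
function bothApproved(claude?: ReviewApproval, codex?: ReviewApproval): boolean {
  return !!claude?.approved && !!codex?.approved;
}

console.log(bothApproved(undefined, { approved: true }));            // false
console.log(bothApproved({ approved: true }, { approved: false })); // false
console.log(bothApproved({ approved: true }, { approved: true }));  // true
```

So on the first render, before either reviewer has produced a row, the loop cannot stop early.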

## Parallel Multi-Agent Review

Why two reviewers? Because they catch different things. Claude is strong on architecture and logic. Codex is strong on code correctness and edge cases. Running them in [parallel](/components/parallel) costs wall-clock time equal to the slower one -- not the sum.

Use `continueOnFail` so one reviewer timing out does not block the other:

```tsx
// components/Review.tsx
import { Parallel } from "smithers-orchestrator";
import { Task, useCtx, tables, outputs } from "../smithers";
import { claude, codex } from "../agents";
import ReviewPrompt from "./Review.mdx";
import type { Ticket } from "./Discover.schema";
import type { ValidateOutput } from "./Validate.schema";

export function Review({ ticket }: { ticket: Ticket }) {
  const ctx = useCtx();
  const ticketId = ticket.id;
  const latestValidate = ctx.latest(tables.validate, `${ticketId}:validate`) as ValidateOutput | undefined;

  // Skip review if tests fail -- send back to Implement
  if (!latestValidate?.allPassed) return null;

  const reviewProps = {
    ticketId,
    ticketTitle: ticket.title,
    ticketDescription: ticket.description,
  };

  return (
    <Parallel>
      <Task
        id={`${ticketId}:review-claude`}
        output={outputs.review}
        agent={claude}
        timeoutMs={15 * 60 * 1000}
        continueOnFail
      >
        <ReviewPrompt {...reviewProps} reviewer="claude" />
      </Task>
      <Task
        id={`${ticketId}:review-codex`}
        output={outputs.review}
        agent={codex}
        timeoutMs={15 * 60 * 1000}
        continueOnFail
      >
        <ReviewPrompt {...reviewProps} reviewer="codex" />
      </Task>
    </Parallel>
  );
}
```

There is a subtle but important detail at the top: if validation failed, the component returns `null`. No point reviewing code that does not pass tests. The loop skips review entirely and cycles back to Implement.

## Feeding Review Feedback Back to Implement

This is where the loop earns its keep. On the second (and third) iteration, the Implement component reads previous review issues and validation failures, then hands them to the agent:

```tsx
// components/Implement.tsx
import { Task, useCtx, tables, outputs } from "../smithers";
import { codex } from "../agents";
import ImplementPrompt from "./Implement.mdx";
import type { Ticket } from "./Discover.schema";

export function Implement({ ticket }: { ticket: Ticket }) {
  const ctx = useCtx();
  const ticketId = ticket.id;

  const latestImplement = ctx.latest(tables.implement, `${ticketId}:implement`);
  const latestValidate = ctx.latest(tables.validate, `${ticketId}:validate`);
  const claudeReview = ctx.latest(tables.review, `${ticketId}:review-claude`);
  const codexReview = ctx.latest(tables.review, `${ticketId}:review-codex`);

  const reviewIssues = [
    ...(claudeReview?.issues ?? []),
    ...(codexReview?.issues ?? []),
  ];

  return (
    <Task id={`${ticketId}:implement`} output={outputs.implement} agent={codex} timeoutMs={45 * 60 * 1000}>
      <ImplementPrompt
        ticketId={ticketId}
        ticketTitle={ticket.title}
        ticketDescription={ticket.description}
        previousImplementation={latestImplement ?? null}
        validationFeedback={latestValidate ?? null}
        reviewFixes={reviewIssues.length > 0 ? JSON.stringify(reviewIssues, null, 2) : null}
      />
    </Task>
  );
}
```

On the first iteration, `reviewIssues` is empty and `previousImplementation` is null. The agent starts fresh. On subsequent iterations, it gets a structured list of everything that went wrong. No ambiguity, no lost context.
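
To make the merge step concrete, here is a standalone sketch of how the two issue lists could be combined into the `reviewFixes` string. The types follow the `review` schema above; `collectReviewFixes` is a hypothetical helper, and the severity sort is an illustrative addition that the component itself does not perform.

```ts
type Severity = "critical" | "major" | "minor" | "nit";
type Issue = { severity: Severity; file: string; description: string };
type ReviewWithIssues = { issues: Issue[] };

const SEVERITY_ORDER: Record<Severity, number> = { critical: 0, major: 1, minor: 2, nit: 3 };

// Combine issues from both reviewers. Returns null when there is nothing to
// report, matching how the component passes `reviewFixes`.
function collectReviewFixes(claude?: ReviewWithIssues, codex?: ReviewWithIssues): string | null {
  const issues = [...(claude?.issues ?? []), ...(codex?.issues ?? [])];
  if (issues.length === 0) return null;
  // Illustrative extra: put the most severe issues first.
  const sorted = [...issues].sort(
    (a, b) => SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity],
  );
  return JSON.stringify(sorted, null, 2);
}

// First iteration: no reviews exist yet, so the prompt gets null.
console.log(collectReviewFixes()); // null
```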

## ReviewFix with skipIf

What if both reviewers approved? Then there is nothing to fix. The [Task](/components/task) `skipIf` prop handles this cleanly:

```tsx
// components/ReviewFix.tsx
import { Task, useCtx, tables, outputs } from "../smithers";
import { codex } from "../agents";
import ReviewFixPrompt from "./ReviewFix.mdx";
import type { Ticket } from "./Discover.schema";

export function ReviewFix({ ticket }: { ticket: Ticket }) {
  const ctx = useCtx();
  const ticketId = ticket.id;

  const claudeReview = ctx.latest(tables.review, `${ticketId}:review-claude`);
  const codexReview = ctx.latest(tables.review, `${ticketId}:review-codex`);

  const allApproved = !!claudeReview?.approved && !!codexReview?.approved;
  const allIssues = [...(claudeReview?.issues ?? []), ...(codexReview?.issues ?? [])];

  return (
    <Task
      id={`${ticketId}:review-fix`}
      output={outputs.reviewFix}
      agent={codex}
      skipIf={allApproved || allIssues.length === 0}
    >
      <ReviewFixPrompt
        ticketId={ticketId}
        issues={allIssues}
        feedback={[claudeReview?.feedback, codexReview?.feedback].filter(Boolean).join("\n\n")}
      />
    </Task>
  );
}
```

If there are no issues, the task is skipped. If both reviewers approved, the task is skipped. The loop's `until` condition sees `allApproved` and stops. No wasted compute.
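
The skip predicate is easy to check in isolation. The sketch below restates the `skipIf` expression as a pure function; `ReviewSummary` and `shouldSkipReviewFix` are hypothetical names used only for this illustration.

```ts
type ReviewSummary = { approved: boolean; issues: unknown[] };

// Same logic as the skipIf prop: skip when both approved or when no issues exist.
function shouldSkipReviewFix(claude?: ReviewSummary, codex?: ReviewSummary): boolean {
  const allApproved = !!claude?.approved && !!codex?.approved;
  const issueCount = (claude?.issues.length ?? 0) + (codex?.issues.length ?? 0);
  return allApproved || issueCount === 0;
}

// No reviews yet (for example, validation failed and Review rendered nothing).
console.log(shouldSkipReviewFix()); // true

// One reviewer rejected and listed an issue: the fixer runs.
console.log(
  shouldSkipReviewFix(
    { approved: false, issues: [{ description: "missing null check" }] },
    { approved: true, issues: [] },
  ),
); // false
```

One consequence worth knowing: a reviewer that rejects without listing any issues also results in a skip, so the next Implement pass has only the free-text `feedback` to work from.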

## Why This Pattern Works

Five properties make this loop reliable in production:

- **Validation before review** -- No point reviewing code that does not compile or pass tests. If validation fails, the loop skips review and goes straight back to implement.
- **Parallel review** -- Two different models catch different kinds of issues. Claude is strong on architecture and logic; Codex is strong on code correctness and edge cases.
- **Structured issues** -- Review output uses a typed `issues` array with severity, file, and description. This lets ReviewFix address each issue systematically instead of parsing free-text feedback.
- **Bounded iterations** -- `maxIterations` prevents infinite loops. Set `onMaxReached="return-last"` to accept the best effort after the cap.
- **Resumable** -- Every step persists to SQLite. If the workflow crashes mid-loop, it [resumes](/guides/resumability) from the last incomplete task. Not from the beginning. From right where it stopped.

## Next Steps

- [ReviewLoop Component](/components/review-loop) -- Component reference for the packaged loop pattern.
- [Loop Component](/components/loop) -- All Loop props and iteration semantics.
- [Parallel Component](/components/parallel) -- Parallel reviewer execution semantics.
- [Resumability](/guides/resumability) -- Recover mid-loop without rerunning completed steps.
- [Dynamic Tickets](/guides/dynamic-tickets) -- Generate tickets dynamically instead of hardcoding.
- [Model Selection](/guides/model-selection) -- Which models to use for each step.

---

## Workflow Patterns

> Recommended project structure, naming conventions, and organizational patterns for Smithers workflows.
> Source: https://smithers.sh/guides/patterns

A workflow with one task fits in a single file. A workflow with twenty tasks does not. These patterns show you how to organize a Smithers project so it stays readable as it grows.

## Project Structure

For small workflows (1-5 tasks), a flat layout with a single `workflow.tsx` is fine:

```
my-workflow/
  package.json
  tsconfig.json
  workflow.tsx          # Workflow definition
  agents.ts             # Agent configuration
  schemas.ts            # All Zod schemas in one place
  prompts/
    analyze.mdx         # MDX prompt templates
    review.mdx
  lib/
    helpers.ts           # Shared utility functions
```

When you cross roughly ten tasks, the single `workflow.tsx` file starts to hurt. Split tasks into component files:

```
my-workflow/
  package.json
  tsconfig.json
  bunfig.toml            # MDX preload config (if using MDX prompts)
  preload.ts
  workflow.tsx
  agents.ts
  schemas.ts
  components/
    Discover.tsx
    Implement.tsx
    Review.tsx
    Report.tsx
  prompts/
    discover.mdx
    implement.mdx
    review.mdx
  lib/
    render.ts            # MDX-to-text renderer
    helpers.ts
```

The key insight: `workflow.tsx` should contain only [control flow](/concepts/control-flow) -- how tasks connect, branch, and loop. The *what* lives in components and prompts. The *shape of data* lives in schemas. The *who does the work* lives in agents.

### Single-File Pattern

For prototyping or simple workflows, keep everything in one file. As soon as prompts become non-trivial, move them into `.mdx` files and leave `workflow.tsx` focused on composition.

```tsx
// workflow.tsx
import { createSmithers, Task } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  analysis: z.object({ summary: z.string(), risk: z.enum(["low", "medium", "high"]) }),
  report: z.object({ title: z.string(), body: z.string() }),
});

const analyst = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a code analyst. Return structured JSON.",
});

export default smithers((ctx) => (
  <Workflow name="quick-review">
    <Task id="analyze" output={outputs.analysis} agent={analyst}>
      {`Analyze: ${ctx.input.target}`}
    </Task>
    <Task id="report" output={outputs.report} deps={{ analyze: outputs.analysis }}>
      {(deps) => ({
        title: "Review Complete",
        body: deps.analyze.summary,
      })}
    </Task>
  </Workflow>
));
```

Under thirty lines. Two tasks. You can read the entire workflow without scrolling. That is the point. Start here, and only split when the file forces you to.

## Schema Organization

Keep all [Zod](https://zod.dev) schemas in a centralized `schemas.ts` file. When someone new looks at your project, this is the first file they should read -- it is the complete [data model](/concepts/data-model) at a glance:

```ts
// schemas.ts
import { z } from "zod";

export const ticketSchema = z.object({
  id: z.string(),
  title: z.string(),
  description: z.string(),
  priority: z.enum(["low", "medium", "high"]),
});

export const schemas = {
  discover: z.object({
    tickets: z.array(ticketSchema).max(5),
  }),
  implement: z.object({
    summary: z.string(),
    filesChanged: z.array(z.string()),
    testsAdded: z.number(),
  }),
  review: z.object({
    approved: z.boolean(),
    feedback: z.string(),
    suggestions: z.array(z.string()),
  }),
  report: z.object({
    title: z.string(),
    body: z.string(),
    totalTickets: z.number(),
    totalApproved: z.number(),
  }),
};
```

Then your workflow file stays clean:

```tsx
// workflow.tsx
import { createSmithers, Task, Sequence } from "smithers-orchestrator";
import { schemas } from "./schemas";

const { Workflow, smithers, outputs } = createSmithers(schemas);
```

One import, one call, done. All the data-shape decisions live in one place.

## Task ID Naming Conventions

Task IDs must be unique within a workflow and deterministic across renders. If an ID changes between renders, Smithers treats it as a different task -- and that breaks [resumability](/guides/resumability).

**Simple tasks**: use a short, descriptive name.

```tsx
<Task id="analyze" output={outputs.analysis} agent={analyst}>
```

**Dynamic tasks** (generated from arrays): use a prefix with a stable identifier.

```tsx
{/* assuming outputs from createSmithers */}
{tickets.map((ticket) => (
  <Task key={ticket.id} id={`${ticket.id}:implement`} output={outputs.implement} agent={implementer}>
    {`Implement ticket ${ticket.id}: ${ticket.title}`}
  </Task>
))}
```

**Iteration-aware tasks** (inside [Loop](/components/loop)): the task ID stays the same across iterations. Smithers differentiates them by the `iteration` column.

```tsx
{/* assuming outputs from createSmithers */}
<Loop until={approved} maxIterations={3}>
  <Task id="review" output={outputs.review} agent={reviewer}>
    Review the implementation.
  </Task>
</Loop>
```

The naming convention: `{entity}:{action}` for dynamic tasks, plain `{action}` for single tasks.

```
analyze              -- single analysis task
ticket-42:implement  -- implementing ticket 42
ticket-42:review     -- reviewing ticket 42
report               -- final report
```

Why the colon? It gives you a visual namespace. You can scan a list of node IDs and instantly see which ticket each task belongs to.
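
If you build these IDs in several places, a small helper keeps them consistent. The sketch below is one possible shape; `taskId` and `groupByEntity` are hypothetical helpers, not Smithers exports, and rejecting `:` inside the parts is an assumption made here to keep parsing unambiguous.

```ts
// Build an `{entity}:{action}` ID from stable parts.
function taskId(entity: string, action: string): string {
  if (entity.includes(":") || action.includes(":")) {
    throw new Error("entity and action must not contain ':'");
  }
  return `${entity}:${action}`;
}

// Group node IDs by entity; plain `{action}` IDs are stored under the "" key.
function groupByEntity(ids: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const id of ids) {
    const sep = id.indexOf(":");
    const entity = sep === -1 ? "" : id.slice(0, sep);
    const action = sep === -1 ? id : id.slice(sep + 1);
    groups.set(entity, [...(groups.get(entity) ?? []), action]);
  }
  return groups;
}

const grouped = groupByEntity([
  "analyze",
  taskId("ticket-42", "implement"),
  taskId("ticket-42", "review"),
  "report",
]);
console.log(JSON.stringify(grouped.get("ticket-42"))); // ["implement","review"]
```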

## Agent Configuration

Centralize agent setup in `agents.ts`. This file answers one question: who does what? This example uses the [Vercel AI SDK](https://ai-sdk.dev) with [Anthropic Claude](https://docs.anthropic.com) models.

```ts
// agents.ts
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { read, grep, bash, edit, write } from "smithers-orchestrator";

const MODEL = process.env.CLAUDE_MODEL ?? "claude-sonnet-4-20250514";

export const analyst = new Agent({
  model: anthropic(MODEL),
  instructions: "You are a senior code analyst. Return structured JSON.",
});

export const implementer = new Agent({
  model: anthropic(MODEL),
  instructions: "You are a senior engineer. Implement changes and return structured JSON.",
  tools: { read, grep, bash, edit, write },
});

export const reviewer = new Agent({
  model: anthropic(MODEL),
  instructions: "You are a strict code reviewer. Return structured JSON with approval status.",
  tools: { read, grep },
});
```

Three agents, clearly named, with distinct tool sets. The analyst does not get `bash`. The reviewer does not get `edit`. Least privilege, enforced by configuration.

## MDX Prompt Templates

For prompts longer than a couple of lines, use [MDX prompts](/guides/mdx-prompts). This keeps your JSX clean and lets you compose prompts with variables:

```mdx
{/* prompts/review.mdx */}
Review the following implementation:

**Ticket**: {props.ticket.title}
**Description**: {props.ticket.description}

**Changes made**:
{props.summary}

**Files changed**:
{props.files.map(f => `- ${f}`).join("\n")}

Return JSON with:
- approved (boolean)
- feedback (string)
- suggestions (string[])
```

Enable MDX imports in Bun:

```toml
# bunfig.toml
preload = ["./preload.ts"]
```

```ts
// preload.ts
import { plugin, type BunPlugin } from "bun";
import mdx from "@mdx-js/esbuild";

plugin(mdx() as unknown as BunPlugin);
```

Use it directly in your component:

```tsx
// components/Review.tsx
import { Task } from "smithers-orchestrator";
import { reviewer } from "../agents";
import { outputs } from "../schemas"; // assuming outputs from createSmithers
import ReviewPrompt from "../prompts/review.mdx";

export function Review({ ticket, summary, files }: {
  ticket: { id: string; title: string; description: string };
  summary: string;
  files: string[];
}) {
  // Key the task on the stable ticket ID; a title can change between renders.
  return (
    <Task id={`${ticket.id}:review`} output={outputs.review} agent={reviewer}>
      <ReviewPrompt ticket={ticket} summary={summary} files={files} />
    </Task>
  );
}
```

The component file is pure wiring. The prompt file is pure language. Neither contaminates the other.

## Output Access Patterns

There are two ways to read a previous task's output, and they serve different purposes.

Use `deps` for straightforward task-to-task handoff -- "this task needs that task's result":

```tsx
// assuming outputs from createSmithers
export default smithers((ctx) => (
  <Workflow name="example">
    <Sequence>
      <Task id="analyze" output={outputs.analysis} agent={analyst}>
        {`Analyze: ${ctx.input.description}`}
      </Task>

      <Task id="report" output={outputs.report} deps={{ analyze: outputs.analysis }}>
        {(deps) => ({ summary: deps.analyze.summary, risk: deps.analyze.risk })}
      </Task>
    </Sequence>
  </Workflow>
));
```

Use `ctx.outputMaybe()` when the *[control flow](/concepts/control-flow) itself* depends on the answer -- "should this task even exist?":

```tsx
const analysis = ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" });

return analysis?.risk === "high" ? (
  <Task id="escalate" output={outputs.escalation}>...</Task>
) : null;
```

The distinction matters. `deps` is about data flow inside a prompt. `ctx.outputMaybe()` is about [control flow](/concepts/control-flow) in your JSX tree.

## Environment-Based Configuration

Use environment variables for settings that change between development and production, especially [model selection](/guides/model-selection) and [CLI agents](/integrations/cli-agents):

```ts
// agents.ts
const MODEL = process.env.CLAUDE_MODEL ?? "claude-sonnet-4-20250514";
const USE_CLI = process.env.USE_CLI_AGENTS === "1";
```

```bash
# Development
CLAUDE_MODEL=claude-sonnet-4-20250514 bun run workflow.tsx

# Production (use a more capable model)
CLAUDE_MODEL=claude-opus-4-6 bun run workflow.tsx
```

## Next Steps

- [Tutorial](/guides/tutorial-workflow) -- End-to-end tutorial using these patterns.
- [Project Structure](/guides/project-structure) -- Compare with the dedicated repository layout guide.
- [MDX Prompts](/guides/mdx-prompts) -- Move long prompts into reusable templates.
- [Best Practices](/guides/best-practices) -- Higher-level guidelines for effective workflows.

---

## Model Selection

> Which AI models to use for each step in a Smithers workflow, and how to configure CLI vs API agents.
> Source: https://smithers.sh/guides/model-selection

Not all tasks are created equal. An implementation task needs a model that writes correct code. A review task needs a model that reasons about architecture. A simple file-reading task needs a model that is fast and cheap.

Choosing the right model for each task is the difference between a workflow that works and one that burns money on overkill -- or worse, fails because you used a cheap model for a hard job.

## Recommended Models

### Codex (gpt-5.3-codex) -- Implementation

Codex is the strongest model for writing and modifying code. Use it for:

- Implementing features
- Fixing bugs
- Running and interpreting tests
- Refactoring code
- Fixing review issues

**Reasoning effort**: Set to `high` by default. Use `xhigh` for especially complex tasks -- architectural refactors, multi-file changes with tricky dependencies.

### Claude Opus (claude-opus-4-6) -- Planning and Review

Claude Opus is the strongest model for reasoning about architecture and evaluating code quality. Use it for:

- Research and codebase exploration
- Planning implementation steps
- Code review
- Report generation
- Orchestration logic and tool calling

### Claude Sonnet (claude-sonnet-4-5-20250929) -- Simple Tasks

Sonnet is fast, cheap, and good enough for straightforward work. Use it for:

- Simple tool calling (reading files, running commands)
- Lightweight reviews where deep reasoning is not needed
- Report aggregation from structured data
- Tasks where a more expensive model would be wasteful

## Summary Table

| Task Type | Recommended Model | Why |
| --- | --- | --- |
| Implementing code | Codex | Strongest at code generation |
| Reviewing code | Claude Opus + Codex (parallel) | Two models catch more issues |
| Research and planning | Claude Opus | Strongest at architectural reasoning |
| Running tests / validation | Codex | Good at interpreting build output |
| Simple tool calls | Claude Sonnet | Fast, cheap, sufficient |
| Report generation | Claude Sonnet or Opus | Depends on complexity |
| Ticket discovery | Codex or Claude Opus | Both work well for codebase analysis |

The parallel review row deserves special attention. Running two different models on the same review catches more bugs than running one model twice. They have different blind spots.

## CLI Agents vs AI SDK Agents

Smithers supports two ways to run Claude and Codex models: as CLI agents or as AI SDK agents. (Kimi is CLI-only.) The choice depends on how you pay.

### CLI Agents (subscription-based)

Use `ClaudeCodeAgent`, `CodexAgent`, and `KimiAgent` when you have a subscription to the respective service. The agent runs as a subprocess using the CLI binary, which provides its native tool ecosystem -- file editing, shell access, and everything else the CLI supports.

```ts
import { ClaudeCodeAgent, CodexAgent, KimiAgent } from "smithers-orchestrator";
import { SYSTEM_PROMPT } from "./system-prompt"; // your shared system prompt module

const claude = new ClaudeCodeAgent({
  model: "claude-opus-4-6",
  systemPrompt: SYSTEM_PROMPT,
  dangerouslySkipPermissions: true,
  timeoutMs: 30 * 60 * 1000,
});

const codex = new CodexAgent({
  model: "gpt-5.3-codex",
  systemPrompt: SYSTEM_PROMPT,
  yolo: true,
  config: { model_reasoning_effort: "high" },
  timeoutMs: 30 * 60 * 1000,
});

const kimi = new KimiAgent({
  model: "kimi-latest",
  systemPrompt: SYSTEM_PROMPT,
  thinking: true,
  timeoutMs: 30 * 60 * 1000,
});
```

### AI SDK Agents (API billing)

Use `AnthropicAgent` and `OpenAIAgent` when you want per-token billing instead of a subscription, or when you want sandboxed tools from Smithers:

```ts
import { stepCountIs } from "ai";
import { AnthropicAgent, OpenAIAgent, tools } from "smithers-orchestrator";
import { SYSTEM_PROMPT } from "./system-prompt"; // your shared system prompt module

const claude = new AnthropicAgent({
  model: "claude-opus-4-6",
  tools,
  instructions: SYSTEM_PROMPT,
  stopWhen: stepCountIs(100),
});

const codex = new OpenAIAgent({
  model: "gpt-5.3-codex",
  tools,
  instructions: SYSTEM_PROMPT,
  stopWhen: stepCountIs(100),
});
```

## Dual-Agent Setup

In practice, you want the flexibility to switch between CLI and API agents without rewriting your workflow. Define both and let an environment variable decide:

```ts
// agents.ts
import { stepCountIs, type ToolSet } from "ai";
import {
  AnthropicAgent,
  ClaudeCodeAgent,
  CodexAgent,
  KimiAgent,
  OpenAIAgent,
  tools as smithersTools,
} from "smithers-orchestrator";
import { SYSTEM_PROMPT } from "./system-prompt";

const tools = smithersTools as ToolSet;
const USE_CLI = process.env.USE_CLI_AGENTS !== "0" && process.env.USE_CLI_AGENTS !== "false";
const UNSAFE = process.env.SMITHERS_UNSAFE === "1";

// --- Codex ---
const CODEX_MODEL = process.env.CODEX_MODEL ?? "gpt-5.3-codex";

const codexApi = new OpenAIAgent({
  model: CODEX_MODEL,
  tools,
  instructions: SYSTEM_PROMPT,
  stopWhen: stepCountIs(100),
  maxOutputTokens: 8192,
});

const codexCli = new CodexAgent({
  model: CODEX_MODEL,
  systemPrompt: SYSTEM_PROMPT,
  yolo: UNSAFE,
  config: { model_reasoning_effort: "high" },
  timeoutMs: 30 * 60 * 1000,
});

export const codex = USE_CLI ? codexCli : codexApi;

// --- Claude ---
const CLAUDE_MODEL = process.env.CLAUDE_MODEL ?? "claude-opus-4-6";

const claudeApi = new AnthropicAgent({
  model: CLAUDE_MODEL,
  tools,
  instructions: SYSTEM_PROMPT,
  stopWhen: stepCountIs(100),
  maxOutputTokens: 8192,
});

const claudeCli = new ClaudeCodeAgent({
  model: CLAUDE_MODEL,
  systemPrompt: SYSTEM_PROMPT,
  dangerouslySkipPermissions: UNSAFE,
  timeoutMs: 30 * 60 * 1000,
});

export const claude = USE_CLI ? claudeCli : claudeApi;

// --- Kimi ---
const KIMI_MODEL = process.env.KIMI_MODEL ?? "kimi-latest";

const kimiCli = new KimiAgent({
  model: KIMI_MODEL,
  systemPrompt: SYSTEM_PROMPT,
  thinking: true,
  timeoutMs: 30 * 60 * 1000,
});

export const kimi = kimiCli; // Kimi is CLI-only
```

Switch at launch time:

```bash
# Use CLI agents (subscription)
USE_CLI_AGENTS=1 SMITHERS_UNSAFE=1 bunx smithers up workflow.tsx

# Use API agents
USE_CLI_AGENTS=0 bunx smithers up workflow.tsx
```

Your workflow code never changes. Only the agent wiring does.
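
The two flags in `agents.ts` have different defaults, which is easy to miss. The sketch below lifts the same comparisons into pure functions so the behavior can be checked; the function names are hypothetical and taking `env` as a parameter is just a testing convenience.

```ts
type Env = Record<string, string | undefined>;

// Same rule as USE_CLI above: CLI agents are used unless the variable is
// explicitly "0" or "false".
function useCliAgents(env: Env): boolean {
  return env.USE_CLI_AGENTS !== "0" && env.USE_CLI_AGENTS !== "false";
}

// Same rule as UNSAFE above: permission-skipping is opt-in via exactly "1".
function unsafeMode(env: Env): boolean {
  return env.SMITHERS_UNSAFE === "1";
}

console.log(useCliAgents({}));                        // true  (unset defaults to CLI)
console.log(useCliAgents({ USE_CLI_AGENTS: "0" }));   // false
console.log(useCliAgents({ USE_CLI_AGENTS: "no" }));  // true  (only "0" or "false" opt out)
console.log(unsafeMode({ SMITHERS_UNSAFE: "true" })); // false (only "1" enables it)
```

In other words, leaving `USE_CLI_AGENTS` unset selects the CLI agents, while `SMITHERS_UNSAFE` stays off unless it is set to exactly `1`.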

## Assigning Models to Steps

In a typical workflow with a [review loop](/guides/review-loop), assign models by what they are good at:

| Step | Agent | Reasoning |
| --- | --- | --- |
| Discover | `codex` | Good at codebase analysis and structured output |
| Research | `claude` | Strong at finding patterns and synthesizing information |
| Plan | `claude` | Best at architectural reasoning |
| Implement | `codex` | Strongest at writing code |
| Validate | `codex` | Good at running and interpreting tests |
| Review (parallel) | `claude` + `codex` | Two models catch different issue types |
| ReviewFix | `codex` | Fixing code is implementation work |
| Report | `claude` | Good at summarization |

Notice the pattern: Codex does the hands-on coding, Claude does the thinking and judging. The review step uses both because that is where coverage matters most.
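
One way to keep this table from drifting out of sync with the code is to encode it as a typed map. This is a sketch, not a Smithers feature: `Step`, `AgentName`, and `STEP_AGENTS` are hypothetical names, and the string values stand in for the agent objects exported from `agents.ts`.

```ts
type AgentName = "claude" | "codex";
type Step =
  | "discover" | "research" | "plan" | "implement"
  | "validate" | "review" | "reviewFix" | "report";

// Mirrors the table above. Record<Step, ...> makes the compiler flag a missing step.
const STEP_AGENTS: Record<Step, readonly AgentName[]> = {
  discover: ["codex"],
  research: ["claude"],
  plan: ["claude"],
  implement: ["codex"],
  validate: ["codex"],
  review: ["claude", "codex"], // run in parallel
  reviewFix: ["codex"],
  report: ["claude"],
};

function agentsFor(step: Step): readonly AgentName[] {
  return STEP_AGENTS[step];
}

console.log(agentsFor("review").join(" + ")); // claude + codex
```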

## Codex Reasoning Effort

The `model_reasoning_effort` config controls how much thinking Codex does before it generates. Higher effort produces better results but costs more time and tokens.

```ts
const codex = new CodexAgent({
  model: "gpt-5.3-codex",
  config: { model_reasoning_effort: "high" },  // default recommendation
});
```

| Level | Use when |
| --- | --- |
| `medium` | Simple, well-defined changes with clear instructions |
| `high` | Default. Most implementation and review tasks |
| `xhigh` | Complex architectural changes, multi-file refactors, tricky edge cases |

When in doubt, use `high`. You can always bump it to `xhigh` for the tasks that keep failing.
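
If you want the bump to happen automatically, a tiny policy function is enough. The sketch below is purely illustrative: `effortForAttempt` is a hypothetical helper, and the threshold of two failures is an arbitrary choice, not a Smithers default.

```ts
type ReasoningEffort = "medium" | "high" | "xhigh";

// Hypothetical policy: stay at "high" until a task has failed `threshold`
// times, then escalate to "xhigh".
function effortForAttempt(previousFailures: number, threshold = 2): ReasoningEffort {
  return previousFailures >= threshold ? "xhigh" : "high";
}

// The result would be passed as CodexAgent's `config.model_reasoning_effort`.
const escalatedConfig = { model_reasoning_effort: effortForAttempt(3) };
console.log(escalatedConfig.model_reasoning_effort); // xhigh
```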

## Next Steps

- [Implement-Review Loop](/guides/review-loop) -- The recommended review loop pattern.
- [CLI Agents](/integrations/cli-agents) -- Full reference for ClaudeCodeAgent, CodexAgent, GeminiAgent, PiAgent, KimiAgent.
- [Built-in Tools](/integrations/tools) -- Sandboxed tools for AI SDK agents.

---

## Structured Output

> How Smithers validates agent outputs against Zod schemas, retries on failure, and handles auto-populated columns.
> Source: https://smithers.sh/guides/structured-output

Every `<Task>` produces structured output validated against a schema and persisted to SQLite.

## Schema-Driven Output

```tsx
import { createSmithers, Task } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  analysis: z.object({
    summary: z.string(),
    issues: z.array(z.string()),
    risk: z.enum(["low", "medium", "high"]),
  }),
});

const analyst = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "Return JSON matching the schema exactly.",
});

export default smithers((ctx) => (
  <Workflow name="structured-output">
    <Task id="analyze" output={outputs.analysis} agent={analyst}>
      {`Analyze this codebase: ${ctx.input.target}.
Return JSON with:
- summary (string)
- issues (string[])
- risk ("low" | "medium" | "high")`}
    </Task>
  </Workflow>
));
```

Downstream tasks consume structured output via `deps`:

```tsx
<Task id="report" output={outputs.report} agent={writer} deps={{ analyze: outputs.analysis }}>
  {(deps) => `Write a report for ${deps.analyze.summary}`}
</Task>
```

## The outputSchema Prop

When a `<Task>` child is a React or MDX element, Smithers auto-injects a `schema` prop into that element -- a JSON example derived from the Zod schema (in this example, the one passed as `outputSchema`):

```tsx
<Task id="analyze" output={outputs.analysis} agent={analyst} outputSchema={analysisSchema}>
  <AnalysisPrompt repo={ctx.input.repoPath} />
</Task>
```

```mdx
{/* prompts/analysis.mdx */}
Analyze the repository at {props.repo}.

Return JSON matching this schema:
{props.schema}
```

For string children, describe the expected shape in the prompt text. The `outputSchema` prop still participates in validation and cache key computation.

## Validation Flow

1. **JSON extraction** -- Tries structured output, raw JSON, code-fenced JSON, then balanced-brace extraction. If none is found, a follow-up prompt requests the JSON.
2. **Auto-populated column stripping** -- `runId`, `nodeId`, and `iteration` are stripped before validation. The agent need not include them.
3. **Schema validation** -- Extracted JSON is validated against the Zod schema (if set) and the Drizzle table schema.
4. **Auto-retry** -- On failure, Smithers sends up to 2 retry prompts that include the Zod error details:

   ```
   Your previous response did not match the expected schema.
   Errors:
   - issues: Expected array, received string
   - risk: Invalid enum value. Expected 'low' | 'medium' | 'high', received 'moderate'

   Please return valid JSON matching the schema.
   ```

5. **Persistence** -- On success, the row is written with `runId`, `nodeId`, `iteration` auto-populated.
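
The extraction and stripping steps above can be sketched in plain TypeScript. This is a minimal illustration under stated assumptions, not Smithers' actual implementation: `extractJson`, `extractBalancedObject`, and `stripAutoColumns` are hypothetical names, and the native structured-output path and the follow-up prompt are left out.

```typescript
// Illustrative sketch only; these helpers are hypothetical, not Smithers APIs.
const FENCE = "`".repeat(3); // built at runtime so the example nests safely in Markdown
const AUTO_COLUMNS = ["runId", "nodeId", "iteration"] as const;

// Return the first balanced {...} span, ignoring braces inside JSON strings.
function extractBalancedObject(text: string): string | null {
  const start = text.indexOf("{");
  if (start === -1) return null;
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}" && --depth === 0) return text.slice(start, i + 1);
  }
  return null;
}

// Try raw JSON, then code-fenced JSON, then balanced-brace extraction.
function extractJson(text: string): Record<string, unknown> | null {
  const fenced = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE).exec(text);
  const candidates = [text.trim(), fenced?.[1]?.trim() ?? null, extractBalancedObject(text)];
  for (const candidate of candidates) {
    if (!candidate) continue;
    try {
      const parsed: unknown = JSON.parse(candidate);
      if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
        return parsed as Record<string, unknown>;
      }
    } catch {
      // Not valid JSON; fall through to the next strategy.
    }
  }
  return null;
}

// Drop auto-populated columns before schema validation.
function stripAutoColumns(row: Record<string, unknown>): Record<string, unknown> {
  const copy = { ...row };
  for (const key of AUTO_COLUMNS) delete copy[key];
  return copy;
}
```

Ordering the strategies from strictest to loosest means a clean response is parsed as-is, and only chatty responses fall back to brace matching.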

## Auto-Populated Columns

| Column | Type | Description |
|---|---|---|
| `runId` | `string` | Current run ID |
| `nodeId` | `string` | Task `id` prop |
| `iteration` | `integer` | Loop iteration (0 for non-loop tasks) |

These are auto-added by `createSmithers`, stripped from agent responses, and auto-populated on write. Zod schemas should only describe business fields:

```tsx
const analysisSchema = z.object({
  summary: z.string(),
  issues: z.array(z.string()),
});
// Agent returns: { "summary": "...", "issues": ["..."] }
// Smithers adds runId, nodeId, iteration automatically.
```

## Static Mode

Tasks without an `agent` prop write children directly to the database, still validated against the table schema:

```tsx
<Task id="config" output={outputs.config} noRetry>
  {{ environment: "production", version: 3 }}
</Task>
```

Because static payload mismatches are usually deterministic authoring errors, `noRetry` is a good default for one-shot validation. Without it, the normal task retry policy still applies.

## JSON Mode Columns

With `createSmithers`, Zod arrays and objects are automatically stored as JSON text columns:

```tsx
const { Workflow, smithers, outputs } = createSmithers({
  analysis: z.object({
    issues: z.array(z.string()), // stored as JSON text automatically
  }),
});
```

## Combining Zod and Drizzle Schemas

With the manual Drizzle API (without `createSmithers`), pair a Drizzle table with a Zod `outputSchema` for double validation:

```tsx
import { sqliteTable, text, integer, primaryKey } from "drizzle-orm/sqlite-core";
import { z } from "zod";

const analysisTable = sqliteTable(
  "analysis",
  {
    runId: text("run_id").notNull(),
    nodeId: text("node_id").notNull(),
    summary: text("summary").notNull(),
    issues: text("issues", { mode: "json" }).$type<string[]>(),
    risk: integer("risk").notNull(),
  },
  (t) => ({
    pk: primaryKey({ columns: [t.runId, t.nodeId] }),
  }),
);

const analysisSchema = z.object({
  summary: z.string(),
  issues: z.array(z.string()),
  risk: z.number().int().min(1).max(10),
});

<Task id="analyze" output={analysisTable} outputSchema={analysisSchema} agent={analyst}>
  Analyze the codebase.
</Task>
```

`outputSchema` validates JSON structure (including the `risk` range); the Drizzle table validates column types and nullability.

## Next Steps

- [Error Handling](/guides/error-handling) -- What happens when validation fails after all retries.
- [Patterns](/guides/patterns) -- Schema organization for larger projects.
- [Data Model](/concepts/data-model) -- Required columns and primary key conventions.

---

## Error Handling

> Retries, timeouts, conditional skipping, and graceful degradation for Smithers workflows.
> Source: https://smithers.sh/guides/error-handling

Agent tasks fail. Models hallucinate invalid JSON. API calls time out. Rate limits kick in at the worst possible moment. The question is not whether your workflow will encounter errors -- it is whether your workflow will handle them gracefully or fall over.

Smithers gives you six mechanisms. Let's look at each one, starting with the simplest.

## Typed Runtime Errors

Smithers runtime failures use typed `SmithersError` objects. Built-in errors expose:

- `code` -- machine-readable discriminator
- `summary` -- raw human-readable message
- `message` -- the summary plus a docs link
- `docsUrl` -- direct link to the error reference

If you catch runtime failures yourself, prefer switching on `KnownSmithersErrorCode` and keep the full code list synced from [Error Reference](/reference/errors).
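
As a rough sketch of that advice, the handler below switches on `code` with a compile-time exhaustiveness check. The `SmithersErrorLike` interface is a local stand-in built from the four fields listed above, and the two error codes are placeholders invented for illustration; substitute the real `KnownSmithersErrorCode` values from the Error Reference.

```typescript
// Placeholder codes, NOT real Smithers error codes. See the Error Reference.
type ExampleErrorCode = "EXAMPLE_TIMEOUT" | "EXAMPLE_INVALID_OUTPUT";

// Local stand-in for the documented error shape.
interface SmithersErrorLike {
  code: ExampleErrorCode; // machine-readable discriminator
  summary: string;        // raw human-readable message
  message: string;        // summary plus a docs link
  docsUrl: string;        // direct link to the error reference
}

function describeFailure(err: SmithersErrorLike): string {
  switch (err.code) {
    case "EXAMPLE_TIMEOUT":
      return `Timed out: ${err.summary} (see ${err.docsUrl})`;
    case "EXAMPLE_INVALID_OUTPUT":
      return `Bad output: ${err.summary} (see ${err.docsUrl})`;
    default: {
      // Fails to type-check if a new code is added but not handled above.
      const unhandled: never = err.code;
      return `Unhandled error code: ${String(unhandled)}`;
    }
  }
}
```

The `never` assignment is what makes a stale code list visible: adding a code to the union without a matching `case` becomes a compile error rather than a silent fallthrough.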

## Retries

By default, tasks retry indefinitely with exponential backoff (1s, 2s, 4s, 8s, ... capped at 5 minutes). This means most transient failures -- rate limits, model errors, network blips -- are absorbed automatically without any configuration.

You can override the default with the `retries` prop. The value is the number of *additional* attempts after the first failure -- so `retries={2}` means up to 3 total attempts:

```tsx
{/* assuming outputs from createSmithers */}
<Task id="analyze" output={outputs.analysis} agent={analyst} retries={2}>
  Analyze the codebase and return structured JSON.
</Task>
```

To disable retries entirely, use `noRetry` or `retries={0}`:

```tsx
<Task id="validate" output={outputs.check} agent={checker} noRetry>
  One-shot validation -- do not retry.
</Task>
```

Each retry creates a new row in `_smithers_attempts`. Previous attempts are never overwritten -- you can inspect every failure after the fact. Between the failure and the next attempt, a `NodeRetrying` event is emitted.

The task is marked `failed` only after all retries are exhausted. With the default infinite retries, this never happens -- use `smithers cancel` to stop a persistently failing task, or set an explicit `retries` count.

### Schema validation retries

Here is a subtlety that will save you retry budget. When the agent returns JSON that does not match the output schema, Smithers does not immediately burn a `retries` count. Instead, it sends up to 2 follow-up prompts *within the same attempt*, appending the validation errors so the agent can fix its response.

Only if those schema retries also fail does the attempt fail -- and then the `retries` mechanism takes over (if configured).

So `retries={2}` with schema validation gives you up to 9 chances to get a valid response: 3 attempts, each with an initial response plus up to 2 schema follow-ups. That is usually more than enough.
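
The arithmetic generalizes to any `retries` value. The helper below is a hypothetical illustration (not a Smithers function) that assumes the 2 follow-up prompts per attempt described above:

```typescript
// Upper bound on model responses for one task, per the rules described above.
const SCHEMA_FOLLOW_UPS = 2; // "up to 2 follow-up prompts" within one attempt

function maxResponses(retries: number): number {
  const attempts = retries + 1;                  // first attempt plus retries
  const triesPerAttempt = 1 + SCHEMA_FOLLOW_UPS; // initial response plus follow-ups
  return attempts * triesPerAttempt;
}

console.log(maxResponses(2)); // 9
console.log(maxResponses(0)); // 3
```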

### Retry Backoff

The `retryPolicy` prop controls the delay between retries. Retrying immediately is fine for transient model errors, but terrible for rate-limited APIs.

Three backoff strategies are available: `fixed`, `linear`, and `exponential`.

**Fixed** waits the same duration every time:

```tsx
{/* 1s, 1s, 1s */}
<Task
  id="api-call"
  retries={3}
  retryPolicy={{ backoff: "fixed", initialDelayMs: 1000 }}
>
  Call the external API.
</Task>
```

Delay = `initialDelayMs` for every attempt. Three retries with `initialDelayMs: 1000` means three 1-second waits.

**Linear** increases the delay proportionally to the attempt number:

```tsx
{/* 1s, 2s, 3s */}
<Task
  id="api-call"
  retries={3}
  retryPolicy={{ backoff: "linear", initialDelayMs: 1000 }}
>
  Call the external API.
</Task>
```

Delay = `initialDelayMs * attempt`. Attempt 1 waits 1s, attempt 2 waits 2s, attempt 3 waits 3s.

**Exponential** doubles the delay each time:

```tsx
{/* 1s, 2s, 4s */}
<Task
  id="api-call"
  retries={3}
  retryPolicy={{ backoff: "exponential", initialDelayMs: 1000 }}
>
  Call the external API.
</Task>
```

Delay = `initialDelayMs * 2^(attempt - 1)`. Attempt 1 waits 1s, attempt 2 waits 2s, attempt 3 waits 4s. This is the right choice for rate-limited external services -- it backs off fast enough to let quotas recover.

If you omit `backoff`, it defaults to `"fixed"`. If you omit `initialDelayMs` or set it to 0, the policy is ignored, so the task behaves as if no `retryPolicy` were set.

The type is straightforward:

```ts
type RetryPolicy = {
  backoff?: "fixed" | "linear" | "exponential";
  initialDelayMs?: number;
};
```
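
The three formulas in this section can be written as one function. This is a sketch of the documented arithmetic, not the engine's code; `retryDelayMs` is a hypothetical name, and it returns `null` when the policy would be ignored rather than guessing what the runtime does instead.

```typescript
type RetryPolicy = {
  backoff?: "fixed" | "linear" | "exponential";
  initialDelayMs?: number;
};

// Delay before retry number `attempt` (1-based), or null if the policy is ignored.
function retryDelayMs(policy: RetryPolicy | undefined, attempt: number): number | null {
  const initial = policy?.initialDelayMs ?? 0;
  if (initial <= 0) return null; // missing or zero initialDelayMs: policy ignored
  switch (policy?.backoff ?? "fixed") {
    case "linear":
      return initial * attempt;
    case "exponential":
      return initial * 2 ** (attempt - 1);
    default:
      return initial; // "fixed"
  }
}

const attempts = [1, 2, 3];
const show = (backoff: RetryPolicy["backoff"]) =>
  attempts.map((n) => retryDelayMs({ backoff, initialDelayMs: 1000 }, n));

console.log(show("fixed"));       // [1000, 1000, 1000]
console.log(show("linear"));      // [1000, 2000, 3000]
console.log(show("exponential")); // [1000, 2000, 4000]
```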

### Side-effect tool warnings on retry

When a task retries after a previous attempt already executed a non-idempotent side-effect tool call (a tool defined with `sideEffect: true, idempotent: false` via `defineTool`), Smithers injects a warning into the retry prompt. The warning tells the agent that those side effects may already have happened and that it should verify external state before calling them again. Smithers also reuses the same `ctx.idempotencyKey` across retries so your tool implementations can deduplicate.

This matters most when you combine `retryPolicy` with tools that modify external state -- sending emails, creating records, charging payments. The backoff gives external systems time to settle, and the warning prevents the agent from blindly repeating mutations. See [Built-in Tools](/integrations/tools) for details on `defineTool` and the `sideEffect` flag.
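
One way a tool implementation might use the stable key to deduplicate is sketched below. Only the idea of one `idempotencyKey` shared across retries comes from the text; `ToolContext`, `sendEmailOnce`, and the in-memory map are assumptions made for illustration.

```typescript
// Hypothetical tool body; only the shared idempotency key is documented behavior.
type ToolContext = { idempotencyKey: string };

const sentEmails = new Map<string, string>(); // idempotencyKey -> message id
let sendCount = 0;

function sendEmailOnce(ctx: ToolContext, to: string): string {
  const previous = sentEmails.get(ctx.idempotencyKey);
  if (previous !== undefined) return previous; // retry: skip the duplicate send
  sendCount += 1; // stands in for the real, non-idempotent side effect
  const messageId = `msg-${sendCount}:${to}`;
  sentEmails.set(ctx.idempotencyKey, messageId);
  return messageId;
}
```

An in-memory map only deduplicates within one process. A real tool would more likely pass the key to the external service or record it in a durable store so the check survives a crash.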

## Timeouts

Set `timeoutMs` to limit how long a single attempt can take:

```tsx
{/* assuming outputs from createSmithers */}
<Task id="analyze" output={outputs.analysis} agent={analyst} timeoutMs={60_000} retries={1}>
  Analyze the codebase.
</Task>
```

If the task exceeds the timeout, the attempt fails with a timeout error. If `retries` is set, the task retries. This is your guard against agent calls that hang indefinitely -- a rate-limited API that never responds, a model that gets stuck in a reasoning loop, a network partition.

## continueOnFail

By default, when a task fails (after exhausting all retries), the workflow stops. Sometimes that is not what you want. Linting is nice to have but should not block the final report. Telemetry should not take down your pipeline.

Set `continueOnFail` to let subsequent tasks proceed:

```tsx
{/* assuming outputs from createSmithers */}
<Task id="optional-lint" output={outputs.lint} agent={linter} retries={1} continueOnFail>
  Run lint checks on the codebase.
</Task>

<Task id="report" output={outputs.report} agent={reporter}>
  Generate the final report.
</Task>
```

The `report` task executes even if `optional-lint` fails. The failed task's node state is `failed`, but the workflow continues. Use this for non-critical steps -- linting, optional analysis passes, telemetry.

## skipIf

Sometimes you know at render time that a task should not run. Maybe you are in "quick" mode and do not need a deep analysis. `skipIf` handles this:

```tsx
{/* assuming outputs from createSmithers */}
<Task
  id="deep-analysis"
  output={outputs.analysis}
  agent={analyst}
  skipIf={ctx.input.mode === "quick"}
>
  Run a thorough analysis of the codebase.
</Task>
```

When `skipIf` evaluates to `true`, the task is marked `skipped` immediately. It will not run even if the condition changes on a later render cycle.

**Important**: `skipIf` is evaluated during rendering, not during execution. For tasks that should only run *after* a prerequisite completes, use conditional rendering instead:

```tsx
// Preferred: conditional rendering
// assuming outputs from createSmithers
const analysis = ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" });

{analysis ? (
  <Task id="fix" output={outputs.fix} agent={fixer}>
    {`Fix these issues: ${analysis.summary}`}
  </Task>
) : null}
```

The difference: `skipIf` says "this task exists but should not run." Conditional rendering says "this task does not exist yet."

## Branch for Error Recovery

What if a task might fail, and you want to take a different path depending on the outcome? That is what `<Branch>` is for:

```tsx
import { createSmithers, Task, Sequence, Branch } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  risky: z.object({
    ok: z.boolean(),
    message: z.string(),
  }),
  output: z.object({
    summary: z.string(),
  }),
});

const riskyAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "Attempt the operation. Return JSON with ok (boolean) and message (string).",
});

export default smithers((ctx) => {
  const risky = ctx.outputMaybe(outputs.risky, { nodeId: "risky" });
  const ok = risky?.ok ?? false;

  return (
    <Workflow name="error-recovery">
      <Sequence>
        <Task id="risky" output={outputs.risky} agent={riskyAgent} retries={2} timeoutMs={30_000}>
          Attempt the operation.
        </Task>

        <Branch
          if={ok}
          then={
            <Task id="summary" output={outputs.output}>
              {{ summary: `Success: ${risky?.message}` }}
            </Task>
          }
          else={
            <Task id="summary" output={outputs.output}>
              {{ summary: `Fallback: operation did not succeed` }}
            </Task>
          }
        />
      </Sequence>
    </Workflow>
  );
});
```

Here is what happens step by step. On the first render, `risky` is `undefined` so `ok` is `false` -- but the `risky` task runs first because it appears earlier in the `<Sequence>`. After `risky` completes, the workflow re-renders, `ok` resolves to the actual value, and the appropriate branch is taken.

The `<Branch>` component does not introduce any magic. It is just conditional rendering with a name.

## Combining Patterns

Real workflows combine multiple error handling patterns. Here is one that combines retries, timeouts, `continueOnFail`, `skipIf`, and conditional rendering:

```tsx
// assuming outputs from createSmithers
export default smithers((ctx) => {
  const analysis = ctx.outputMaybe(outputs.analysis, { nodeId: "analyze" });
  const lint = ctx.outputMaybe(outputs.lint, { nodeId: "lint" });

  return (
    <Workflow name="robust-pipeline">
      <Sequence>
        {/* Retries + timeout for the critical analysis step */}
        <Task id="analyze" output={outputs.analysis} agent={analyst} retries={3} timeoutMs={120_000}>
          Analyze the codebase thoroughly.
        </Task>

        {/* Optional lint step -- continues even if it fails */}
        {analysis ? (
          <Task id="lint" output={outputs.lint} agent={linter} retries={1} continueOnFail>
            {`Lint the files: ${analysis.filesAnalyzed.join(", ")}`}
          </Task>
        ) : null}

        {/* Skip the detailed report in quick mode */}
        {analysis ? (
          <Task
            id="report"
            output={outputs.report}
            agent={reporter}
            skipIf={ctx.input.mode === "quick"}
          >
            {`Generate a detailed report.
Analysis: ${analysis.summary}
Lint results: ${lint?.issues?.join(", ") ?? "lint skipped or failed"}`}
          </Task>
        ) : null}

        {/* Always produce a final summary */}
        {analysis ? (
          <Task id="final" output={outputs.output}>
            {{ summary: analysis.summary, lintPassed: lint?.passed ?? null }}
          </Task>
        ) : null}
      </Sequence>
    </Workflow>
  );
});
```

Read the comments. Each task uses a different error handling strategy based on how critical it is. The analysis step retries aggressively -- it is the foundation. The lint step uses `continueOnFail` -- nice to have, not essential. The report uses `skipIf` -- unnecessary in quick mode. The final summary runs whenever the analysis exists, regardless of how lint fared.

## Error Handling Summary

| Mechanism | Prop | Effect |
|---|---|---|
| **Retries** | `retries={N}` | Retry up to N times after failure. Default: `Infinity` (retry forever). Each attempt is recorded. |
| **No retry** | `noRetry` | Disable retries. Equivalent to `retries={0}`. |
| **Retry backoff** | `retryPolicy={{ backoff, initialDelayMs }}` | Control delay between retries: `fixed`, `linear`, or `exponential`. Default: exponential from 1s, capped at 5 min. |
| **Timeout** | `timeoutMs={N}` | Fail the attempt after N milliseconds. Combines with retries. |
| **Continue on fail** | `continueOnFail` | Let subsequent tasks run even if this task fails. |
| **Skip** | `skipIf={boolean}` | Skip the task at render time. Evaluated once per render cycle. |
| **Branch** | `<Branch if={...} then={...} else={...} />` | Route to different tasks based on a condition. |
| **Conditional rendering** | `{condition ? <Task /> : null}` | Mount tasks only when prerequisites are available. |

## Next Steps

- [Resumability](/guides/resumability) -- How failed runs can be resumed after fixing issues.
- [Debugging](/guides/debugging) -- Inspect failed attempts and error details.
- [Error Reference](/reference/errors) -- Exhaustive built-in runtime error codes and details.
- [Execution Model](/concepts/execution-model) -- How retries and node states work internally.

---

## Resumability

> How Smithers persists state to SQLite and resumes interrupted runs deterministically.
> Source: https://smithers.sh/guides/resumability

Long-running workflows crash. Networks fail. Processes get killed. Deploys happen. If your workflow runs for two hours and dies at minute 119, you do not want to start over.

Smithers persists every task's output to SQLite as it completes. When you resume a run, it skips the tasks that already finished and picks up from the ones that did not. The result: minutes of recovery instead of hours of re-execution.

## How It Works

Every task output is written to SQLite keyed by `(runId, nodeId, iteration)`. When you resume, Smithers re-renders the JSX tree with the persisted outputs already available in `ctx`. Tasks with valid output rows are marked `finished` and skipped. Tasks that were in-progress or pending are scheduled to run again.

The resume flow, step by step:

1. **Load existing state** -- Smithers reads `_smithers_runs`, `_smithers_nodes`, and `_smithers_attempts` for the given `runId`.
2. **Metadata check** -- The stored workflow path, workflow file hash, and VCS metadata are compared against the current environment. If they changed, resume fails fast. This prevents you from accidentally running new code against old state.
3. **Stale attempt cleanup** -- Any in-progress attempts older than 15 minutes are automatically cancelled. This prevents zombie tasks from blocking forward progress. The associated nodes are reset to `pending`.
4. **Re-render** -- The JSX tree is rendered with the current `ctx`, which includes all previously persisted outputs. Completed tasks are naturally skipped because their output already exists.
5. **Resume execution** -- The engine schedules and executes any remaining runnable tasks.

That is it. No manual checkpointing. No state serialization code. You get resumability by using task IDs correctly.
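
The metadata check in step 2 amounts to a field-by-field comparison that fails fast. The sketch below is illustrative only: the `RunMetadata` field names and both functions are assumptions, since the text says only that the workflow path, file hash, and VCS metadata are compared.

```typescript
// Field names are assumed for illustration, not taken from the Smithers schema.
type RunMetadata = { workflowPath: string; workflowHash: string; vcsRevision: string };

// List the metadata fields that differ between the stored run and the current environment.
function metadataMismatches(stored: RunMetadata, current: RunMetadata): string[] {
  const keys: (keyof RunMetadata)[] = ["workflowPath", "workflowHash", "vcsRevision"];
  return keys.filter((key) => stored[key] !== current[key]);
}

// Fail fast instead of running changed code against old state.
function assertResumable(stored: RunMetadata, current: RunMetadata): void {
  const changed = metadataMismatches(stored, current);
  if (changed.length > 0) {
    throw new Error(`Cannot resume: ${changed.join(", ")} changed since the run started`);
  }
}
```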

## Deterministic Node IDs

Resumability lives or dies by stable, deterministic node identity. A task's identity comes from its `id` prop:

```tsx
{/* assuming outputs from createSmithers */}
<Task id="analyze" output={outputs.analysis} agent={analyst}>
  Analyze the codebase.
</Task>
```

The `nodeId` in the database is `"analyze"`. If you rename the `id` prop between runs, Smithers treats it as a new task and the old output is orphaned -- sitting in the database, unused, while the "new" task starts from scratch.

**Rules for stable IDs:**

- Use fixed, descriptive strings for static tasks: `id="analyze"`, `id="report"`.
- For dynamic tasks, derive the ID from a stable identifier: `id={`${ticket.id}:implement`}`.
- Never use array indices or timestamps as IDs. They change between renders.

This is the single most important thing to get right for resumability. Everything else follows from it.
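
A small, self-contained comparison shows why index-based IDs break resume. The `Ticket` type and both ID helpers are hypothetical; the point is only that an ID derived from the data survives a reordered list, while an ID derived from position does not.

```typescript
type Ticket = { id: string; title: string };

// Stable: derived from the ticket's own identifier.
const stableId = (ticket: Ticket): string => `${ticket.id}:implement`;
// Unstable: derived from array position, which shifts when the list changes.
const indexId = (_ticket: Ticket, index: number): string => `implement-${index}`;

const firstRender: Ticket[] = [
  { id: "T-1", title: "Fix login" },
  { id: "T-2", title: "Fix logout" },
];
// A new ticket is inserted at the front before the run is resumed.
const secondRender: Ticket[] = [{ id: "T-0", title: "Urgent hotfix" }, ...firstRender];

console.log(firstRender.map((t) => stableId(t)));  // ["T-1:implement", "T-2:implement"]
console.log(secondRender.map((t) => stableId(t))); // ["T-0:implement", "T-1:implement", "T-2:implement"]
console.log(firstRender.map(indexId));             // ["implement-0", "implement-1"]
console.log(secondRender.map(indexId));            // ["implement-0", "implement-1", "implement-2"]
```

With index-based IDs, `implement-0` now refers to a different ticket, so the persisted output for "Fix login" would be matched to the wrong task.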

## Resume via CLI

Start a run, then resume it later:

```bash
# Start the run
bunx smithers up workflow.tsx --run-id my-run --input '{"description": "Fix auth bugs"}'

# Process crashes or is cancelled...

# Resume the same run
bunx smithers up workflow.tsx --run-id my-run --resume true
```

On resume, the input row must already exist in the database. Smithers will throw an error if it is missing. You do not need to pass `--input` again -- it was persisted on the first run.

## Resume Programmatically

```ts
import { runWorkflow } from "smithers-orchestrator";
import workflow from "./workflow";

// Initial run
const result1 = await runWorkflow(workflow, {
  runId: "my-run",
  input: { description: "Fix auth bugs" },
});

// result1.status might be "failed" or "waiting-approval"

// Resume the same run later
const result2 = await runWorkflow(workflow, {
  runId: "my-run",
  resume: true,
});

// result2 picks up from where result1 left off
```

When `resume: true` is set, Smithers loads the existing run state instead of creating a new run.

## What Gets Skipped on Resume

| Node state before resume | Behavior on resume |
|---|---|
| `finished` | Skipped. Output row exists and is valid. |
| `skipped` | Remains skipped. |
| `failed` (retries exhausted) | Stays failed unless the workflow code changed to allow more retries. |
| `in-progress` (stale) | Cancelled after 15 minutes, then retried as `pending`. |
| `in-progress` (recent) | Left in-progress. If the process died, the attempt will time out and be cleaned up on the next resume. |
| `pending` | Scheduled for execution. |
| `waiting-approval` | Stays waiting. Approve or deny to unblock. |
| `cancelled` | Stays cancelled. |

The 15-minute threshold for stale attempts deserves explanation. Why not cancel immediately? Because some tasks legitimately run for a long time -- a complex implementation step with a 30-minute timeout, for example. Cancelling it prematurely would waste the work already done. Fifteen minutes is a conservative default that catches zombie processes without killing slow-but-alive ones.

## Stale Attempt Recovery

If a process crashes mid-execution, some tasks may be stuck in `in-progress` state with no process to complete them. Smithers handles this automatically:

- On resume, any in-progress attempt with a `started_at_ms` older than 15 minutes is marked `cancelled`.
- The associated node is reset to `pending`.
- The task will be picked up on the next scheduling pass.

No manual intervention required.
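
The cleanup rule can be sketched as a pure function over attempt rows. This is a hypothetical illustration of the behavior described above, not the engine's code; the `Attempt` shape loosely mirrors the columns named in this guide.

```typescript
// Illustrative only: a simplified attempt row.
type Attempt = {
  nodeId: string;
  status: "in-progress" | "finished" | "failed" | "cancelled";
  startedAtMs: number;
};

const STALE_AFTER_MS = 15 * 60 * 1000; // the 15-minute threshold

function recoverStaleAttempts(
  attempts: Attempt[],
  nowMs: number,
): { attempts: Attempt[]; resetToPending: string[] } {
  const resetToPending: string[] = [];
  const updated = attempts.map((attempt): Attempt => {
    const stale =
      attempt.status === "in-progress" && nowMs - attempt.startedAtMs > STALE_AFTER_MS;
    if (!stale) return attempt;
    resetToPending.push(attempt.nodeId); // the node goes back to pending
    return { ...attempt, status: "cancelled" };
  });
  return { attempts: updated, resetToPending };
}
```

Recent in-progress attempts pass through unchanged, which matches the "left in-progress" row in the table above.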

## Common Resume Scenarios

### Crash during execution

```bash
# Start a run -- crashes midway through "implement"
bunx smithers up workflow.tsx --run-id run-1 --input '{"repo": "/my-project"}'

# "analyze" finished, "implement" was in-progress, "report" was pending
# Resume picks up from "implement"
bunx smithers up workflow.tsx --run-id run-1 --resume true
```

### Waiting for approval

```bash
# Run pauses at an approval gate
bunx smithers up workflow.tsx --run-id run-2 --input '{"repo": "/my-project"}'
# Status: waiting-approval

# Approve the pending node
bunx smithers approve run-2 --node deploy

# Resume to continue execution
bunx smithers up workflow.tsx --run-id run-2 --resume true
```

### Fixing a bug and retrying

If a task failed because of a bug in your workflow code, you have two options:

1. Fix the code and start a fresh run.
2. Fix the code and resume. This fails the metadata check, because editing the workflow file changes its hash.

In practice, this means: if the failure was in your code, start a new run. If the failure was transient (network, rate limit, model hiccup), resume.

```bash
# Original run failed at "analyze" because of a prompt bug
# Fix the prompt in workflow.tsx, then start a new run
bunx smithers up workflow.tsx --input '{"repo": "/my-project"}'
```

Smithers stores workflow and repository metadata in `_smithers_runs` and requires them to match on resume. This is intentional -- it keeps resume deterministic. Running changed code against old state is a recipe for subtle bugs.

## Database Tables

Smithers uses these internal tables for resume state. You can query them directly for debugging:

```bash
# View run status
sqlite3 smithers.db "SELECT run_id, status, created_at_ms FROM _smithers_runs WHERE run_id = 'my-run';"

# View node states
sqlite3 smithers.db "SELECT node_id, status, iteration FROM _smithers_nodes WHERE run_id = 'my-run' ORDER BY updated_at_ms;"

# View attempts
sqlite3 smithers.db "SELECT node_id, attempt, status, started_at_ms FROM _smithers_attempts WHERE run_id = 'my-run' ORDER BY started_at_ms;"
```

## Tips

- **Always use stable task IDs.** This is worth repeating. Changing IDs between runs breaks resume because the engine cannot match old output rows to new task nodes.
- **Test resume in development.** Run your workflow, cancel it partway through, and resume to verify it picks up correctly. Do this before your first production run, not after.
- **Check for stale runs.** Use `bunx smithers ps --status running` to find runs that may need to be resumed or cancelled.
- **Input immutability.** Once a run starts, the input is persisted. Passing different input on resume is an error. This is by design -- the input is part of the run's identity.

## Next Steps

- [Debugging](/guides/debugging) -- Inspect run state and diagnose resume issues.
- [Execution Model](/concepts/execution-model) -- Understand the render-schedule-execute loop that drives resume.
- [VCS Integration](/guides/vcs) -- Revert filesystem changes to a specific attempt.

---

## Hot Reload

> Edit workflow code while a run is executing -- prompts, config, agents, and component structure update live without restarting.
> Source: https://smithers.sh/guides/hot-reload

Your workflow has been running for forty minutes. Three tasks are done. The fourth is in progress. You just realized the prompt for the fifth task is wrong.

Without hot reload, you kill the process, fix the prompt, and start over. Forty minutes gone.

With hot reload, you save the file. The engine picks up the change. The in-flight task finishes with its original prompt. The fifth task uses your new prompt. Zero wasted work.

## Quick Start

Add `--hot true` to any `up` command:

```bash
smithers up workflow.tsx --hot true
smithers up workflow.tsx --run-id abc123 --resume true --hot true
```

That is it. Edit any file in your workflow's directory tree, save, and the engine picks up the changes on the next render cycle.

## How It Works

Smithers is built on React. Your workflow's `build(ctx)` function returns a React component tree that the engine re-renders on every loop iteration using a [custom React reconciler](/concepts/planner-internals). All run state lives in SQLite, not in the React fiber tree.

Hot reload leverages that architecture in five steps:

1. **Watch** -- The engine watches your workflow's directory tree (excluding `node_modules/`, `.git/`, `.jj/`, `.smithers/`).
2. **Overlay** -- On file change, Smithers creates a "generation overlay" -- a copy of your source tree with fresh file URLs so every module (including transitive dependencies) is re-evaluated.
3. **Import** -- The new workflow module is imported from the overlay. `createSmithers()` returns the cached DB connection and schema maps (no duplicate connections).
4. **Swap** -- Only `workflow.build` is replaced. The database, schema registry, and all persisted state remain untouched.
5. **Wake** -- The engine loop is woken immediately (even if tasks are still running) so it re-renders with the new code.

```
File saved -> watcher detects -> overlay built -> module imported -> build swapped -> re-render
                                                                                       |
                                                                         new tasks use new code
                                                                         in-flight tasks unaffected
```

The key insight: state is in SQLite, not in your code. Swapping the code does not lose state.

## What You Can Change Live

These changes take effect on the next render cycle:

| Change | Effect |
|---|---|
| Prompt strings / `.md` files | New tasks get the updated prompt |
| Focus lists, config values | Scheduler sees new priorities |
| Agent configuration (model, timeout, system prompt) | New agent instances for new tasks |
| JSX tree structure (add/remove/reorder tasks) | New plan tree on next render |
| Concurrency / retry settings | Applied to newly scheduled tasks |

## What Requires a Restart

These changes are blocked to prevent data corruption:

| Change | Why |
|---|---|
| Output Zod schemas (shape changes) | Schema identity is used for output table resolution |
| `createSmithers()` dbPath | Would create a second database connection |
| Adding/removing output schema keys | Changes the schema registry |

If you attempt a schema change, the engine logs a warning and keeps running with the previous code:

```
[00:12:34] Warning: Workflow reload blocked: Schema change detected; restart required to apply schema changes.
```

Why the hard line on schemas? Because output tables are keyed by schema identity. If you change a schema mid-run, existing rows would no longer match the new shape. That is data corruption. So Smithers refuses and tells you to restart.
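
Conceptually, the guard compares the schema registry before and after the reload and refuses the swap on any difference. The sketch below assumes each output key can be reduced to a string fingerprint of its shape; `SchemaFingerprints` and `checkReload` are hypothetical and do not describe how Smithers actually detects schema identity.

```typescript
// Hypothetical: one shape fingerprint per output schema key.
type SchemaFingerprints = Record<string, string>;
type ReloadCheck = { ok: true } | { ok: false; reason: string };

function checkReload(previous: SchemaFingerprints, next: SchemaFingerprints): ReloadCheck {
  const blocked: ReloadCheck = {
    ok: false,
    reason: "Schema change detected; restart required to apply schema changes.",
  };
  const prevKeys = Object.keys(previous).sort();
  const nextKeys = Object.keys(next).sort();
  // Adding or removing an output key changes the registry.
  if (prevKeys.length !== nextKeys.length) return blocked;
  for (let i = 0; i < prevKeys.length; i++) {
    const key = prevKeys[i];
    if (key === undefined || key !== nextKeys[i]) return blocked;
    // A changed shape under the same key is also unsafe.
    if (previous[key] !== next[key]) return blocked;
  }
  return { ok: true };
}
```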

## In-Flight Task Behavior

When a hot reload changes the task graph, tasks that are already running are **not cancelled**. They continue with the code they were launched with and their output is persisted normally.

This means:

- A task started with prompt v1 will finish with v1, even if you have since saved v2.
- If a reload removes a task from the tree, its in-flight attempt still completes. The output may go unused by downstream tasks.
- If a reload changes a task's `id`, the old in-flight attempt is treated as a different node. Both may run.

That last point is worth repeating. Changing a task ID during a hot reload does not "rename" the task. It creates a new one. The old one is orphaned.

## CLI Output

When hot reload detects and applies changes, you will see:

```
[00:05:12] File change detected: 1 file(s)
[00:05:12] Workflow reloaded (generation 1)
```

On errors:

```
[00:05:12] Warning: Workflow reload failed: SyntaxError: Unexpected token
```

The workflow continues running with the last valid code. Fix the error and save again. No panic required.

## Events

Hot reload emits events through the standard [event system](/runtime/events):

| Event | When |
|---|---|
| `WorkflowReloadDetected` | File changes detected (before reload attempt) |
| `WorkflowReloaded` | Reload succeeded; includes `generation` number and `changedFiles` |
| `WorkflowReloadFailed` | Reload failed (syntax error, import error); includes `error` |
| `WorkflowReloadUnsafe` | Reload blocked (schema change); includes `reason` |

These events are persisted to the NDJSON event log and visible via `onProgress`.
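
A handler for these events might look like the sketch below, which renders each one in the style of the CLI output shown earlier (without timestamps). The event names and the `generation`, `changedFiles`, `error`, and `reason` fields come from the table; the `type` discriminator, the exact payload shapes, and `changedFiles` on `WorkflowReloadDetected` are assumptions.

```typescript
// Assumed payload shapes; only the names and listed fields are documented.
type ReloadEvent =
  | { type: "WorkflowReloadDetected"; changedFiles: string[] }
  | { type: "WorkflowReloaded"; generation: number; changedFiles: string[] }
  | { type: "WorkflowReloadFailed"; error: string }
  | { type: "WorkflowReloadUnsafe"; reason: string };

function formatReloadEvent(event: ReloadEvent): string {
  switch (event.type) {
    case "WorkflowReloadDetected":
      return `File change detected: ${event.changedFiles.length} file(s)`;
    case "WorkflowReloaded":
      return `Workflow reloaded (generation ${event.generation})`;
    case "WorkflowReloadFailed":
      return `Warning: Workflow reload failed: ${event.error}`;
    case "WorkflowReloadUnsafe":
      return `Warning: Workflow reload blocked: ${event.reason}`;
  }
}
```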

## Options

| CLI Flag | RunOptions field | Description | Default |
|---|---|---|---|
| `--hot` | `hot: true` | Enable hot reload | Disabled |

Advanced options (via `RunOptions.hot` object):

| Field | Description | Default |
|---|---|---|
| `rootDir` | Directory to watch | Workflow file's directory |
| `outDir` | Overlay output directory | `.smithers/hmr` |
| `maxGenerations` | Number of overlay generations to keep | `3` |
| `cancelUnmounted` | Cancel in-flight tasks that become unmounted after reload | `false` |
| `debounceMs` | Debounce interval for file change events | `100` |

## Tips

### Keep prompts in separate files

If your prompts live in `.md` or `.ts` files imported by your workflow, editing them triggers a hot reload automatically. This is the most common workflow: edit a prompt, save, watch the next task pick it up.

```tsx
// prompts/planning.md changes -> hot reload -> new tasks use updated prompt
import planningPrompt from "./prompts/planning.md";

export default smithers((ctx) => (
  <Workflow name="my-workflow">
    <Task id="plan" agent={new ClaudeCodeAgent({ systemPrompt: planningPrompt })}>
      ...
    </Task>
  </Workflow>
));
```

### Avoid module-scope side effects

Code that runs at import time (outside of `build()`) is re-executed on every reload. If your module-level code opens a file, starts a server, or prints a banner, that will happen again on every save. Keep side effects inside `build()`, or use `createSmithers()`, which is automatically cached in hot mode.

### Use with resumability

Hot reload and [resumability](/guides/resumability) work together naturally. You can:

1. Start a run with `--hot`
2. Edit files while it runs
3. Kill the process (Ctrl+C)
4. Resume with `smithers up workflow.tsx --run-id <id> --resume true --hot true`

The resumed run picks up your latest code and continues watching for changes.

## Related

- [Execution Model](/concepts/execution-model) -- How the render-schedule-execute loop works.
- [Resumability](/guides/resumability) -- How crash recovery preserves state.
- [Events](/runtime/events) -- Subscribing to lifecycle events.
- [CLI Reference](/cli/overview) -- Full CLI flag reference.

---

## MDX Prompts

> Using MDX files for prompt templates and system prompt composition in Smithers workflows.
> Source: https://smithers.sh/guides/mdx-prompts

MDX separates prompt text from orchestration logic. Smithers uses it two ways:

1. **Per-step prompts** -- `.mdx` files with `{props.*}` interpolation, used as `<Task>` children.
2. **System prompt composition** -- A master `.mdx` template that assembles context from multiple markdown docs.

## Setup

```ts
// preload.ts
import { mdxPlugin } from "smithers-orchestrator";
mdxPlugin();
```

```toml
# bunfig.toml
preload = ["./preload.ts"]
```

## Per-Step Prompts

```mdx
{/* components/Review.mdx */}
CODE REVIEW -- Ticket: {props.ticketId} -- {props.ticketTitle}

Reviewer: {props.reviewer}

TICKET DESCRIPTION:
{props.ticketDescription}

ACCEPTANCE CRITERIA:
- {props.acceptanceCriteria}

FILES CHANGED:
Created: {JSON.stringify(props.filesCreated)}
Modified: {JSON.stringify(props.filesModified)}

{props.failingSummary
  ? `VALIDATION FAILURES:\n${props.failingSummary}`
  : "All tests passing."}

**REQUIRED OUTPUT** -- JSON matching this schema:
{props.schema}
```

```tsx
// components/Review.tsx
import { Task, outputs } from "../smithers";
import { claude } from "../agents";
import ReviewPrompt from "./Review.mdx";

export function Review({ ticket }) {
  return (
    <Task id={`${ticket.id}:review`} output={outputs.review} agent={claude}>
      <ReviewPrompt
        ticketId={ticket.id}
        ticketTitle={ticket.title}
        ticketDescription={ticket.description}
        acceptanceCriteria={ticket.acceptanceCriteria?.join("\n- ") ?? ""}
        filesCreated={["src/auth.ts"]}
        filesModified={["src/index.ts"]}
        failingSummary={null}
        reviewer="claude"
      />
    </Task>
  );
}
```

### Auto-injected `{props.schema}`

When a `<Task>` has an output schema (explicit or from `createSmithers()`), Smithers auto-injects a `schema` prop containing a human-readable JSON example from the Zod schema. No manual passing required.

## System Prompt Composition

### 1. Standalone docs

```
prompts/
  system-prompt.mdx     # Master template
  architecture.md       # Architecture docs
  coding-standards.md   # Coding conventions
  git-rules.md          # Git commit conventions
  always-green.md       # "Keep tests passing" rules
```

### 2. Component functions

```ts
import { readFileSync } from "node:fs";
import { resolve } from "node:path";
import { fileURLToPath } from "node:url";
import { renderMdx } from "smithers-orchestrator";
import SystemPromptMdx from "./prompts/system-prompt.mdx";

// fileURLToPath decodes percent-escapes and handles Windows drive letters,
// which a raw URL `.pathname` does not.
const ROOT = fileURLToPath(new URL("../..", import.meta.url));
const PROMPTS = fileURLToPath(new URL("./prompts", import.meta.url));

function readDoc(path: string): string {
  try { return readFileSync(resolve(ROOT, path), "utf8"); }
  catch { return `[Could not read ${path}]`; }
}

function readPrompt(filename: string): string {
  try { return readFileSync(resolve(PROMPTS, filename), "utf8"); }
  catch { return `[Could not read prompt: ${filename}]`; }
}

const ClaudeMd = () => readDoc("CLAUDE.md");
const Architecture = () => readPrompt("architecture.md");
const CodingStandards = () => readPrompt("coding-standards.md");
const GitRules = () => readPrompt("git-rules.md");
const AlwaysGreen = () => readPrompt("always-green.md");

export const SYSTEM_PROMPT = renderMdx(SystemPromptMdx, {
  components: {
    ClaudeMd,
    Architecture,
    CodingStandards,
    GitRules,
    AlwaysGreen,
  },
});
```

### 3. Master template

```mdx
# My Project

You are building [project description].

## Project Conventions

<ClaudeMd />

## Architecture

<Architecture />

## Coding Standards

<CodingStandards />

## Git Rules

<GitRules />

## Quality Rules

<AlwaysGreen />

## JSON Output Requirement

MUST end response with JSON object in code fence. Format specified in task prompt.
```

Each `.md` is standalone, reusable across workflows, and version-controlled. Add or remove sections by adding or removing a component tag.

## Conditional Sections

```mdx
{props.previousAttempt
  ? `PREVIOUS ATTEMPT FAILED:
What was done: ${props.previousAttempt.whatWasDone}
Test output: ${props.previousAttempt.testOutput}
Fix the issues above before proceeding.`
  : "This is the first attempt. Start fresh."}

{props.reviewFixes
  ? `REVIEW FIXES NEEDED:\n${props.reviewFixes}`
  : ""}
```

## Array Rendering

```mdx
FILES TO MODIFY:
{props.filesToModify.map(f => `- ${f}`).join("\n")}

ACCEPTANCE CRITERIA:
{props.acceptanceCriteria.map((c, i) => `${i + 1}. ${c}`).join("\n")}

REVIEW ISSUES:
{JSON.stringify(props.issues, null, 2)}
```
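
To see what these expressions produce, the same code can be run as plain TypeScript. The `props` values below are made up for illustration.

```ts
// Hypothetical props, shaped like the ones the template above expects.
const props = {
  filesToModify: ["src/auth.ts", "src/index.ts"],
  acceptanceCriteria: ["Login works", "Tokens refresh"],
};

// Same expressions as in the MDX template, evaluated as plain TypeScript.
const fileList = props.filesToModify.map((f) => `- ${f}`).join("\n");
const criteriaList = props.acceptanceCriteria.map((c, i) => `${i + 1}. ${c}`).join("\n");

console.log(fileList);
// - src/auth.ts
// - src/index.ts
console.log(criteriaList);
// 1. Login works
// 2. Tokens refresh
```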

## Next Steps

- [Production Project Structure](/guides/project-structure) -- Full recommended file layout.
- [Structured Output](/guides/structured-output) -- How `outputSchema` and `{props.schema}` work together.
- [Patterns](/guides/patterns) -- Naming conventions and organizational patterns.

---

## Third-Party React Hooks

> How standard React hooks, TanStack Query, Zustand, and other React libraries fit inside Smithers workflows.
> Source: https://smithers.sh/guides/third-party-hooks

Smithers workflows are React components.

That means you are not locked into a special workflow-only hook system. If a hook works with the React renderer Smithers uses, you can usually use it inside workflow components too.

The important distinction is durability:

- React hook state is process-local
- Smithers workflow state is durable and lives in SQLite outputs

Use hooks for local render-time coordination. Use task outputs, `ctx.outputMaybe()`, `ctx.latest()`, and output tables for anything the workflow must remember after a crash, restart, or resume.

## What Smithers Provides Natively

Smithers already gives you the workflow runtime itself:

- JSX workflow components such as `<Workflow>`, `<Task>`, `<Sequence>`, `<Parallel>`, `<Branch>`, `<Loop>`, `<Approval>`, `<Signal>`, `<Timer>`, and `<WaitForEvent>`
- Built-in agents such as `ClaudeCodeAgent`, `CodexAgent`, `OpenAIAgent`, `AnthropicAgent`, and others
- Built-in tools such as `read`, `write`, `edit`, `grep`, `bash`, and `defineTool`
- OpenAPI helpers such as `createOpenApiTools()` and `createOpenApiTool()`
- Remote control surfaces such as `Gateway`, `startServer()`, `createServeApp()`, and `signalRun()`
- One root React hook export: `usePatched()`

From `createSmithers()` you also get a workflow-scoped `useCtx()` hook:

```tsx
const { smithers, Workflow, Task, outputs, useCtx } = createSmithers({
  result: z.object({ summary: z.string() }),
});
```

That is the main native hook you use to read workflow input and outputs inside reusable components.

## Core React Hooks In Smithers

These hooks work, but they do not replace durable workflow state.

| Hook | Works? | Good for | Do not use it for |
| --- | --- | --- | --- |
| `useState` | Yes | Local derived state, toggles, cached prompt fragments | Durable workflow facts |
| `useEffect` | Yes | Process-local setup and synchronization | Authoritative side effects that must survive resume |
| `useRef` | Yes | Stable client instances, counters, scratch caches | Persistence |
| `useMemo` | Yes | Expensive derived values, stable providers and clients | Cross-run caching |

### `useState`

`useState` can trigger re-renders inside workflow components just like normal React:

```tsx
function PromptMode() {
  const ctx = useCtx();
  const [mode, setMode] = React.useState("summary");

  React.useEffect(() => {
    if (ctx.input.verbose === true) {
      setMode("detailed");
    }
  }, [ctx.input.verbose]);

  return (
    <Task id="draft" output={outputs.result}>
      {{ summary: `Mode: ${mode}` }}
    </Task>
  );
}
```

That is fine for local render behavior. It is not fine for anything the run must still know after the process dies.

If the value matters to downstream workflow logic, write it to an output table instead of keeping it only in React state.

### `useEffect`

`useEffect` runs and can produce observable changes in a live workflow process, but treat it as a process-local helper, not your durable execution layer.

Good uses:

- initialize local state
- hydrate an in-memory cache
- wire up a `QueryClient`
- keep a derived value in sync with props or `ctx.input`

Avoid:

- sending irreversible API mutations from an effect
- making business-critical decisions only in effect state
- assuming an effect has "already run" after resume

If the action matters, put it in a `<Task>` or a tool call so Smithers can persist it, retry it, and reason about it.

### `useRef`

`useRef` is useful for process-local objects that should survive re-renders without causing new ones:

```tsx
function WorkflowCache() {
  const cache = React.useRef(new Map<string, unknown>());

  return (
    <Task id="cache-size" output={outputs.result}>
      {{ summary: `entries=${cache.current.size}` }}
    </Task>
  );
}
```

Use it for clients, maps, counters, or temporary caches. On restart or resume, the ref resets.

### `useMemo`

`useMemo` is the right tool for stable provider values and expensive derived objects:

```tsx
const queryClient = React.useMemo(
  () =>
    new QueryClient({
      defaultOptions: { queries: { retry: false } },
    }),
  [],
);
```

This is especially useful when you want to embed provider-based libraries such as TanStack Query inside a workflow component tree.

## TanStack Query

TanStack Query works well in Smithers when you want to fetch external context during workflow rendering or share cached fetch results across multiple components in one live run.

### `useQuery`

```tsx
/** @jsxImportSource smithers-orchestrator */
import React from "react";
import { QueryClient, QueryClientProvider, useQuery } from "@tanstack/react-query";
import { Task, Workflow, createSmithers } from "smithers-orchestrator";
import { z } from "zod";

const { smithers, outputs, useCtx } = createSmithers({
  result: z.object({
    repo: z.string(),
    stars: z.number(),
  }),
});

function RepoContext() {
  const ctx = useCtx();
  const { data } = useQuery({
    queryKey: ["repo", ctx.input.owner, ctx.input.repo],
    queryFn: async () => {
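      const res = await fetch(`https://api.github.com/repos/${ctx.input.owner}/${ctx.input.repo}`);
      // Fail loudly on 4xx/5xx instead of recording an error payload as data.
      if (!res.ok) throw new Error(`GitHub API request failed: ${res.status}`);
      return await res.json();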
    },
    retry: false,
  });

  if (!data) {
    return null;
  }

  return (
    <Task id="record" output={outputs.result}>
      {{
        repo: data.full_name,
        stars: data.stargazers_count,
      }}
    </Task>
  );
}

function QueryShell() {
  const queryClient = React.useMemo(
    () => new QueryClient({ defaultOptions: { queries: { retry: false } } }),
    [],
  );

  return (
    <QueryClientProvider client={queryClient}>
      <RepoContext />
    </QueryClientProvider>
  );
}

export default smithers(() => (
  <Workflow name="repo-context">
    <QueryShell />
  </Workflow>
));
```

### `useMutation`

`useMutation` is useful when you want a reusable mutation client in component scope, but still execute the actual mutation from a durable task:

```tsx
import { useMutation } from "@tanstack/react-query";

function SlackPublisher() {
  const postMessage = useMutation({
    mutationFn: async (input: { channel: string; text: string }) => {
      const res = await fetch("https://slack.example.com/messages", {
        method: "POST",
        headers: { "content-type": "application/json" },
        body: JSON.stringify(input),
      });
      return await res.json();
    },
  });

  return (
    <Task id="publish" output={outputs.publish}>
      {async () => {
        const result = await postMessage.mutateAsync({
          channel: "ops",
          text: "Workflow completed",
        });
        return { ts: result.ts };
      }}
    </Task>
  );
}
```

### TanStack Query Rules Of Thumb

- Put the `QueryClient` behind `useMemo`
- Prefer `retry: false` unless you explicitly want another retry layer
- Treat the query cache as local optimization, not workflow truth
- If fetched data matters later, write it to an output table

## Zustand

Zustand is fine for ephemeral local state shared across multiple workflow components in one process.

```tsx
import { create } from "zustand";

const useScratchStore = create<{
  promptStyle: "short" | "long";
  setPromptStyle: (value: "short" | "long") => void;
}>((set) => ({
  promptStyle: "short",
  setPromptStyle: (value) => set({ promptStyle: value }),
}));

function PromptStyleTask() {
  const promptStyle = useScratchStore((state) => state.promptStyle);

  return (
    <Task id="style" output={outputs.result}>
      {{ summary: `style=${promptStyle}` }}
    </Task>
  );
}
```

Use Zustand when it helps component ergonomics. Do not mistake it for workflow persistence.

> Warning: if the state must survive resume, put it in SQLite output tables instead. Zustand stores are in-memory only.

## Vercel AI SDK

Smithers uses the [`ai`](https://sdk.vercel.ai/docs) package internally, so it already fits naturally with AI SDK agents, tools, and streams.

That matters in two ways:

- Inside workflows, you can use AI SDK-compatible agents and tool objects directly
- Outside workflows, `@ai-sdk/react` is a strong choice for dashboards or operator UIs talking to Smithers over gateway or server endpoints

Typical split:

- `ai` package inside the workflow runtime
- `@ai-sdk/react` in the browser or admin UI

If you are building a control panel for approvals or bot conversations around Smithers, the AI SDK React hooks are often a clean match.

## Other Useful Libraries

These are not Smithers-specific, but they pair well with workflow code:

- `swr` for a lighter-weight fetch cache than TanStack Query
- `react-hook-form` for approval or operator UIs that sit on top of gateway endpoints
- `react-error-boundary` for wrapping provider-heavy helper components
- `use-context-selector` when you build large workflow helper trees with custom providers

## Practical Rules

- Use React hooks for local orchestration convenience
- Use Smithers outputs and `ctx` for durable workflow state
- Put side effects that matter inside `<Task>` bodies or tools
- Memoize clients and providers with `useMemo`
- Assume hook state disappears on restart unless you persist it yourself

## Next Steps

- [Reactivity](/concepts/reactivity)
- [Workflow State](/concepts/workflow-state)
- [Built-in Tools](/integrations/tools)
- [OpenAPI Tools](/concepts/openapi-tools)

---

## Dynamic Tickets

> Agent-driven ticket discovery for large projects instead of hardcoded task lists.
> Source: https://smithers.sh/guides/dynamic-tickets

For projects with more than ~20 tasks, hardcoded task lists are fragile. Smithers supports **dynamic ticket discovery**: an agent explores the codebase, compares state to specs, and generates the next batch of tickets at runtime.

| Approach | Best for | Example |
| --- | --- | --- |
| **Dynamic discovery** | Large, evolving projects (>20 tasks) | Building a full application from a PRD |
| **Hardcoded tasks** | Focused features (<20 tasks) | Adding auth, fixing a bug, a specific refactor |

## Discovery Pattern

Three parts: Discover generates tickets, TicketPipeline processes each one, re-render triggers the next batch.

```tsx
// workflow.tsx
import { Sequence, Branch } from "smithers-orchestrator";
import { Discover, TicketPipeline } from "./components";
import { Ticket } from "./components/Discover.schema";
import { Workflow, smithers, tables, outputs } from "./smithers";

export default smithers((ctx) => {
  const discoverOutput = ctx.latest(tables.discover, "discover-codex");
  const unfinishedTickets = ctx
    .latestArray(discoverOutput?.tickets, Ticket)
    .filter((t) => !ctx.latest(tables.report, `${t.id}:report`)) as Ticket[];

  return (
    <Workflow name="my-project">
      <Sequence>
        <Branch if={unfinishedTickets.length === 0} then={<Discover />} />
        {unfinishedTickets.map((ticket) => (
          <TicketPipeline key={ticket.id} ticket={ticket} />
        ))}
      </Sequence>
    </Workflow>
  );
});
```

Execution flow:

1. First render: `unfinishedTickets` is empty, `<Discover />` mounts and runs.
2. Discover persists tickets to the `discover` table.
3. Re-render: tickets exist, so the `<Branch>` condition is false and the `<TicketPipeline>` components mount.
4. Each pipeline runs research, planning, implementation, review.
5. Completed tickets write to `report` table.
6. When all tickets have reports, `unfinishedTickets` empties, Discover runs again.
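
The decision each render makes can be modeled as a pure function. This is a simplified sketch of the flow above, with a plain `Set` of reported ticket IDs standing in for `ctx.latest(tables.report, ...)` lookups. `planRender` is a hypothetical name.

```ts
type Ticket = { id: string };

// Decide what a render should mount, given the latest discovered tickets and
// the set of ticket IDs that already have a report row.
function planRender(tickets: Ticket[], reportedIds: Set<string>) {
  const unfinished = tickets.filter((t) => !reportedIds.has(t.id));
  return unfinished.length === 0
    ? { mount: "discover" as const, tickets: [] as Ticket[] }
    : { mount: "pipelines" as const, tickets: unfinished };
}

const batch = [{ id: "auth-types" }, { id: "auth-routes" }];
console.log(planRender([], new Set()).mount);                      // "discover"
console.log(planRender(batch, new Set(["auth-types"])).mount);     // "pipelines"
console.log(planRender(batch, new Set(batch.map((t) => t.id))).mount); // "discover"
```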

## Discover Component

```tsx
// components/Discover.tsx
import { codex } from "../agents";
import DiscoverPrompt from "./Discover.mdx";
import { Task, useCtx, tables, outputs } from "../smithers";
import { Ticket } from "./Discover.schema";

export function Discover() {
  const ctx = useCtx();

  const discoverOutput = ctx.latest(tables.discover, "discover-codex");
  const allTickets = ctx.latestArray(discoverOutput?.tickets, Ticket);
  const completedIds = allTickets
    .filter((t) => !!ctx.latest(tables.report, `${t.id}:report`))
    .map((t) => t.id);

  const previousRun = completedIds.length > 0
    ? { summary: `Tickets completed: ${completedIds.join(", ")}`, ticketsCompleted: completedIds }
    : null;

  return (
    <Task id="discover-codex" output={outputs.discover} agent={codex}>
      <DiscoverPrompt previousRun={previousRun} />
    </Task>
  );
}
```

## Ticket Schema

```ts
// components/Discover.schema.ts
import { z } from "zod";

export const Ticket = z.object({
  id: z.string().describe("Unique slug identifier (lowercase kebab-case, e.g. 'add-auth-middleware')"),
  title: z.string().describe("Short imperative title"),
  description: z.string().describe("Detailed description of what needs to be implemented"),
  acceptanceCriteria: z.array(z.string()).describe("List of acceptance criteria"),
  dependencies: z.array(z.string()).nullable().describe("IDs of tickets this depends on"),
});
export type Ticket = z.infer<typeof Ticket>;

export const DiscoverOutput = z.object({
  tickets: z.array(Ticket).max(5).describe("The next 0-5 tickets to implement"),
  reasoning: z.string().describe("Why these tickets were chosen and in this order"),
});
export type DiscoverOutput = z.infer<typeof DiscoverOutput>;
```

Ticket ID rules:

- Lowercase kebab-case slugs derived from the title (e.g. `sqlite-wal-init`, `add-auth-middleware`).
- Never numeric IDs like `T-001` -- they collide across discovery runs.
- 2-5 words, short but descriptive.
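
If you want to enforce these rules in code rather than only in the prompt, a small validator is enough. Both helpers below are hypothetical and not part of Smithers. The "counter-style" check is one possible reading of the `T-001` rule.

```ts
// Turn a ticket title into a lowercase kebab-case slug.
function slugifyTitle(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into "-"
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
}

// Check the rules listed above: kebab-case, 2-5 words, and not a
// counter-style ID such as "t-001".
function isValidTicketId(id: string): boolean {
  if (!/^[a-z0-9]+(-[a-z0-9]+)*$/.test(id)) return false;
  const words = id.split("-");
  if (words.length < 2 || words.length > 5) return false;
  return !/^[a-z]?-?\d+$/.test(id);
}

console.log(slugifyTitle("Add Auth Middleware")); // "add-auth-middleware"
console.log(isValidTicketId("sqlite-wal-init")); // true
console.log(isValidTicketId("T-001")); // false
```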

## TicketPipeline

```tsx
// components/TicketPipeline.tsx
import { Sequence } from "smithers-orchestrator";
import { Research } from "./Research";
import { Plan } from "./Plan";
import { ValidationLoop } from "./ValidationLoop";
import { Report } from "./Report";
import { useCtx, tables } from "../smithers";
import type { Ticket } from "./Discover.schema";

export function TicketPipeline({ ticket }: { ticket: Ticket }) {
  const ctx = useCtx();
  const latestReport = ctx.latest(tables.report, `${ticket.id}:report`);
  const ticketComplete = latestReport != null;

  return (
    <Sequence key={ticket.id} skipIf={ticketComplete}>
      <Research ticket={ticket} />
      <Plan ticket={ticket} />
      <ValidationLoop ticket={ticket} />
      <Report ticket={ticket} />
    </Sequence>
  );
}
```

## Discover Prompt Guidelines

1. Compare specs vs. current codebase state.
2. Prioritize foundational work (infrastructure, types) before dependent features.
3. Keep tickets small -- smallest independently testable unit.
4. Pass completed ticket IDs to avoid re-discovering finished work.
5. Limit to 3-5 tickets per batch so each batch benefits from prior implementation context.
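
Guideline 2 can also be checked after discovery by ordering a batch on its `dependencies` field. The sketch below is a plain depth-first topological sort over a trimmed-down ticket type. `orderByDependencies` is a hypothetical helper, and nothing here implies Smithers does this for you.

```ts
type Ticket = { id: string; dependencies: string[] | null };

// Order tickets so that each one comes after the tickets it depends on.
// Dependencies that are not in the batch (e.g. already completed) are ignored.
function orderByDependencies(tickets: Ticket[]): Ticket[] {
  const byId = new Map<string, Ticket>(
    tickets.map((t): [string, Ticket] => [t.id, t]),
  );
  const ordered: Ticket[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (ticket: Ticket): void => {
    const s = state.get(ticket.id);
    if (s === "done") return;
    if (s === "visiting") throw new Error(`Dependency cycle at ${ticket.id}`);
    state.set(ticket.id, "visiting");
    for (const depId of ticket.dependencies ?? []) {
      const dep = byId.get(depId);
      if (dep) visit(dep);
    }
    state.set(ticket.id, "done");
    ordered.push(ticket);
  };

  for (const ticket of tickets) visit(ticket);
  return ordered;
}

const sorted = orderByDependencies([
  { id: "auth-routes", dependencies: ["auth-middleware"] },
  { id: "auth-middleware", dependencies: ["auth-types"] },
  { id: "auth-types", dependencies: null },
]);
console.log(sorted.map((t) => t.id).join(" -> "));
// auth-types -> auth-middleware -> auth-routes
```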

## Hardcoded Tasks (Smaller Projects)

```tsx
const tasks = [
  { id: "auth-types", name: "Add auth types", description: "Define User, Session, Token types" },
  { id: "auth-middleware", name: "Add auth middleware", description: "JWT validation middleware" },
  { id: "auth-routes", name: "Add auth routes", description: "Login, logout, refresh endpoints" },
  { id: "auth-tests", name: "Add auth tests", description: "Unit and integration tests" },
];

export default smithers((ctx) => (
  <Workflow name="add-auth">
    <Sequence>
      {tasks.map(({ id, name, description }) => (
        <Sequence key={id}>
          <Task id={`${id}:implement`} output={outputs.implement} agent={codex}>
            {`Implement: ${name}\n\n${description}`}
          </Task>
          <Task id={`${id}:validate`} output={outputs.validate} agent={codex}>
            {`Run tests for: ${name}`}
          </Task>
        </Sequence>
      ))}
    </Sequence>
  </Workflow>
));
```

Each task can still be wrapped in a [review loop](/guides/review-loop).

## Sprint-Based Discovery

For very large projects, wrap the workflow in a `<Loop>` with a sprint tracker:

```tsx
export default smithers((ctx) => {
  const tracker = ctx.latest(tables.output, "sprint-tracker") as { sprintsCompleted?: number } | undefined;
  const currentSprint = tracker?.sprintsCompleted ?? 0;

  return (
    <Workflow name="large-project">
      <Loop until={currentSprint >= 25} maxIterations={25} onMaxReached="return-last">
        <Sequence>
          <Discover />
          {/* ... ticket pipelines ... */}
          <Task id="sprint-tracker" output={outputs.output}>
            {{ sprintsCompleted: currentSprint + 1 }}
          </Task>
        </Sequence>
      </Loop>
    </Workflow>
  );
});
```

## Next Steps

- [Implement-Review Loop](/guides/review-loop) -- The review loop pattern.
- [Model Selection](/guides/model-selection) -- Models for discovery vs. implementation.
- [Best Practices](/guides/best-practices) -- General workflow design guidelines.

---

## VCS Integration

> How Smithers integrates with JJ (Jujutsu) and Git for filesystem snapshots, worktree management, and revert support.
> Source: https://smithers.sh/guides/vcs

Smithers integrates with [JJ (Jujutsu)](https://github.com/martinvonz/jj) and Git to record filesystem snapshots at each task completion. This enables reverting to the exact workspace state after any attempt.

## VCS Detection

Smithers walks up the directory tree from `rootDir` looking for `.jj` or `.git`. When both exist in the same directory (a colocated repo), Smithers prefers `.jj` and uses JJ semantics. Pure Git repos work without JJ installed.
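
The lookup described above can be approximated in a few lines. This is a sketch of the described behavior, not Smithers' actual implementation. `detectVcs` and `DetectedVcs` are hypothetical names.

```ts
import { existsSync, mkdirSync, mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join, resolve } from "node:path";

type DetectedVcs = { type: "jj" | "git"; root: string };

// Walk up from `startDir` until a directory containing `.jj` or `.git` is found.
// `.jj` is checked first so colocated repos resolve to JJ.
function detectVcs(startDir: string): DetectedVcs | null {
  let dir = resolve(startDir);
  while (true) {
    if (existsSync(join(dir, ".jj"))) return { type: "jj", root: dir };
    if (existsSync(join(dir, ".git"))) return { type: "git", root: dir };
    const parent = dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}

// Usage: a colocated repo (both markers present) resolves to JJ,
// even when detection starts in a nested directory.
const repo = mkdtempSync(join(tmpdir(), "vcs-demo-"));
mkdirSync(join(repo, ".git"));
mkdirSync(join(repo, ".jj"));
mkdirSync(join(repo, "packages", "app"), { recursive: true });
const detected = detectVcs(join(repo, "packages", "app"));
console.log(detected?.type); // "jj"
```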

## VCS Pointer Flow

When a supported VCS is detected:

1. A task executes, making filesystem changes via agent tools.
2. The task completes (success or failure).
3. Smithers captures the current revision into `_smithers_attempts.jj_pointer`:
   - **JJ**: the change ID from `jj log -r @ --no-graph --template change_id`
   - **Git**: the commit SHA from `git rev-parse HEAD`
4. The next task continues from this point.

Each attempt gets its own pointer.

The revision is also recorded on the run row as `vcs_revision` so Smithers can detect workspace drift between a run's start and a later resume — if the revision changed, the engine warns of a potential durability mismatch.
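
Capturing a pointer amounts to running one command per VCS and trimming its output. The sketch below uses the two commands named above. The helper names are hypothetical, and returning `null` on failure is an assumption chosen to match the nullable `jj_pointer` column.

```ts
import { execFileSync } from "node:child_process";

type VcsType = "jj" | "git";

// The command line used to read the current revision for each VCS.
// `--no-graph` keeps graph glyphs out of the jj output.
function pointerCommand(vcs: VcsType): { cmd: string; args: string[] } {
  return vcs === "jj"
    ? { cmd: "jj", args: ["log", "-r", "@", "--no-graph", "--template", "change_id"] }
    : { cmd: "git", args: ["rev-parse", "HEAD"] };
}

// Run the command in `cwd` and return the trimmed pointer, or null if the
// command fails (for example, when the binary is not installed).
function capturePointer(vcs: VcsType, cwd: string): string | null {
  const { cmd, args } = pointerCommand(vcs);
  try {
    return execFileSync(cmd, args, {
      cwd,
      encoding: "utf8",
      stdio: ["ignore", "pipe", "ignore"],
    }).trim();
  } catch {
    return null;
  }
}
```

Using `execFileSync` with an argument array avoids shell quoting issues around the `@` revision.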

## Recorded Data

| Column | Type | Description |
|---|---|---|
| `jj_pointer` | text (nullable) | JJ change ID or Git commit SHA recorded after attempt completion. `null` if no supported VCS is detected. |

```bash
sqlite3 smithers.db "SELECT run_id, node_id, iteration, attempt, jj_pointer FROM _smithers_attempts WHERE run_id = '<id>';"
```

```
smth_a1b2|analyze|0|1|zqkopwvn
smth_a1b2|fix|0|1|xrlmqkts
smth_a1b2|fix|0|2|ynpwzrmv
smth_a1b2|report|0|1|kutswxqp
```
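
If you pipe that output into a script, each line splits cleanly on `|`. The parser below is a hypothetical helper that assumes the sqlite3 CLI's default list mode, where `NULL` prints as an empty string.

```ts
type AttemptRow = {
  runId: string;
  nodeId: string;
  iteration: number;
  attempt: number;
  jjPointer: string | null;
};

// Parse the default pipe-separated sqlite3 output for the query above.
function parseAttemptRows(output: string): AttemptRow[] {
  return output
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const cols = line.split("|");
      return {
        runId: cols[0] ?? "",
        nodeId: cols[1] ?? "",
        iteration: Number(cols[2]),
        attempt: Number(cols[3]),
        jjPointer: cols[4] ? cols[4] : null, // empty column means NULL
      };
    });
}

const rows = parseAttemptRows(`
smth_a1b2|analyze|0|1|zqkopwvn
smth_a1b2|fix|0|1|xrlmqkts
smth_a1b2|fix|0|2|ynpwzrmv
smth_a1b2|report|0|1|kutswxqp
`);
console.log(rows.filter((r) => r.nodeId === "fix").length); // 2
```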

## Revert

```bash
smithers revert workflow.tsx \
  --run-id <run-id> \
  --node-id <node-id> \
  --attempt <attempt-number> \
  --iteration <iteration-number>
```

```bash
smithers revert workflow.tsx --run-id smth_a1b2 --node-id fix --attempt 1 --iteration 0
```

Restores the filesystem to the exact post-attempt state. Emits `RevertStarted` and `RevertFinished` events.

## Without VCS

- `jj_pointer` is `null` for all attempts.
- `revert` fails with an error.
- All other Smithers functionality is unaffected.

JJ is optional. Install only if revert support is needed. Git-only repos also provide pointer tracking and worktree support without JJ.

## Setup

```bash
brew install jj
```

```bash
# New JJ repo
cd /path/to/my-project
jj git init

# Colocate with existing Git repo
jj git init --colocate
```

Smithers auto-detects JJ and starts recording pointers.

## Programmatic Helpers

Smithers exports helpers for running raw `jj` commands, checking repo status, reading/restoring pointers, and managing JJ workspaces. See [VCS Helper Reference](/reference/vcs-helpers).

## Worktrees

Smithers can isolate each workflow run in its own worktree (a separate checkout sharing the same object store). This lets multiple runs modify the filesystem concurrently without stepping on each other.

### Git Worktrees

When `rootDir` points to a Git repository, Smithers calls `git worktree add` to create a new working tree at `worktreePath`. A named branch (`-B`) is created from the best available base ref:

1. `baseBranch` (if configured)
2. `origin/<baseBranch>`
3. `main` / `origin/main`
4. `HEAD`

It tries each in turn and uses the first that succeeds.
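
The fallback order can be written as a candidate list plus a "first success wins" loop. This is a sketch under the assumption that each attempt is a boolean-returning callback. `candidateBaseRefs`, `firstWorkingRef`, and `tryRef` are hypothetical names.

```ts
// Ordered list of base refs to try, following the fallback order above.
function candidateBaseRefs(baseBranch?: string): string[] {
  const refs: string[] = baseBranch ? [baseBranch, `origin/${baseBranch}`] : [];
  refs.push("main", "origin/main", "HEAD");
  return Array.from(new Set(refs)); // drop duplicates when baseBranch is "main"
}

// Return the first ref for which `tryRef` succeeds, or null if none does.
// `tryRef` is a hypothetical callback, e.g. a wrapper around `git worktree add`.
function firstWorkingRef(refs: string[], tryRef: (ref: string) => boolean): string | null {
  for (const ref of refs) {
    if (tryRef(ref)) return ref;
  }
  return null;
}

console.log(candidateBaseRefs("develop").join(", "));
// develop, origin/develop, main, origin/main, HEAD
```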

### JJ Workspaces

For JJ repos, Smithers calls `workspaceAdd` to create a JJ workspace at `worktreePath`. When a `branch` is supplied, it runs `jj bookmark set <branch> -r @` to point a bookmark at the new workspace's working copy.

You can create a workspace at a specific revision:

```ts
import { workspaceAdd } from "smithers-orchestrator/vcs";

await workspaceAdd("feature-fix", "/workspaces/feature-fix", {
  cwd: "/my-repo",
  atRev: "main",   // start the workspace at the tip of main
});
```

`workspaceAdd` tries multiple invocation styles (`jj workspace add --name`, positional, `--wc-path`) to stay compatible across JJ versions. Stale workspaces with the same name are forgotten before creating a new one.

### Rebase on Resume

When Smithers resumes a workflow that already has a worktree, it syncs the worktree to the current tip of the base branch before continuing:

- **JJ**: `jj git fetch` then `jj rebase -d <base>`
- **Git**: `git fetch origin` then `git rebase origin/<base>`

If the rebase fails, Smithers logs a warning and continues anyway — the resume is not blocked.

The base branch defaults to `main` when no `baseBranch` is configured.
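
A minimal sketch of that behavior, with the command runner and the warning sink injected so the logic stays testable. The function names are hypothetical. The commands and the warn-and-continue policy come from the description above.

```ts
type VcsType = "jj" | "git";

// Commands for syncing an existing worktree to the base branch.
function syncCommands(vcs: VcsType, baseBranch = "main"): string[][] {
  return vcs === "jj"
    ? [["jj", "git", "fetch"], ["jj", "rebase", "-d", baseBranch]]
    : [["git", "fetch", "origin"], ["git", "rebase", `origin/${baseBranch}`]];
}

// Run each command through a caller-supplied runner. A failure is reported as
// a warning and does not throw, matching the "continue anyway" behavior.
function syncWorktree(
  vcs: VcsType,
  run: (argv: string[]) => void,
  warn: (message: string) => void,
  baseBranch?: string,
): boolean {
  try {
    for (const argv of syncCommands(vcs, baseBranch)) run(argv);
    return true;
  } catch (err) {
    warn(`worktree sync failed: ${err instanceof Error ? err.message : String(err)}`);
    return false;
  }
}

console.log(syncCommands("git").map((argv) => argv.join(" ")));
// [ 'git fetch origin', 'git rebase origin/main' ]
```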

## Running Workflows at a Specific Revision

When a run is started, Smithers records the current VCS revision (`vcs_revision` on the run row). On resume, it checks the current revision against the stored one. A mismatch emits a warning but does not block the resume.

This revision snapshot is available in the database:

```bash
sqlite3 smithers.db "SELECT run_id, vcs_type, vcs_root, vcs_revision FROM _smithers_runs WHERE run_id = '<id>';"
```

## JJ Change ID Tracking

For JJ repos, the pointer stored in `jj_pointer` is the JJ **change ID** (the stable identifier that persists across amends and rebases), not the operation ID. This means:

- The pointer survives `jj amend`, `jj rebase`, and other history-rewriting commands.
- `jj restore --from <change_id>` reliably restores the working copy to the exact post-attempt state.

The change ID is read via `jj log -r @ --no-graph --template change_id`.

## Cache Key Integration

VCS pointers are included in the cache key when caching is enabled (`<Workflow cache>` or `{ cache: true }`):

| Component |
|---|
| Workflow name + nodeId |
| Prompt text or static payload |
| Model ID and parameters |
| Tool allowlist and versions |
| Output schema signature |
| **VCS pointer (JJ change ID or Git SHA)** |

Workspace changes invalidate cached results. Returning to a previous state (same pointer) reuses the cached result.
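
To make the invalidation behavior concrete, here is a sketch that hashes the listed components into one key. The field names, the SHA-256 choice, and the serialization are illustrative assumptions, not Smithers' actual key derivation.

```ts
import { createHash } from "node:crypto";

// Inputs named after the rows of the table above (hypothetical field names).
type CacheKeyParts = {
  workflowName: string;
  nodeId: string;
  prompt: string;
  model: string;
  tools: string[];
  schemaSignature: string;
  vcsPointer: string | null;
};

function cacheKey(parts: CacheKeyParts): string {
  // Serialize in a fixed order, with tools sorted, so the key is stable.
  const payload = JSON.stringify([
    parts.workflowName,
    parts.nodeId,
    parts.prompt,
    parts.model,
    [...parts.tools].sort(),
    parts.schemaSignature,
    parts.vcsPointer,
  ]);
  return createHash("sha256").update(payload).digest("hex");
}

const base: CacheKeyParts = {
  workflowName: "my-project",
  nodeId: "analyze",
  prompt: "Analyze the diff",
  model: "example-model",
  tools: ["read", "grep"],
  schemaSignature: "v1",
  vcsPointer: "zqkopwvn",
};

const before = cacheKey(base);
const afterChange = cacheKey({ ...base, vcsPointer: "xrlmqkts" });
const afterReturn = cacheKey({ ...base, vcsPointer: "zqkopwvn" });
console.log(before === afterChange); // false: the workspace changed
console.log(before === afterReturn); // true: same pointer, same key
```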

## Next Steps

- [Resumability](/guides/resumability) -- Crash recovery and state persistence.
- [Caching](/concepts/caching) -- Cache key mechanics.
- [CLI Reference](/cli/overview) -- All CLI commands including `revert`.

---

## Terminal UI (TUI)

> Chat-first terminal control plane for orchestrating, monitoring, and steering Smithers workflows.
> Source: https://smithers.sh/guides/tui

```bash
smithers tui
```

The TUI is a chat-first orchestration console. The default surface is a unified activity feed and composer, not a dashboard. Monitoring, approvals, and telemetry are visible without leaving the main workspace.

## Product stance

Smithers TUI complements the CLI. Every meaningful UI action maps to a Smithers API or CLI operation, preserving the same durable, scriptable mental model.

The TUI is **not** a full replacement for Claude Code, Codex CLI, Gemini CLI, or Amp. It is not a direct-edit harness by default. It is the orchestration layer above harnesses and API providers.

## Shell layout

The shell has four regions: workspace rail, activity feed, inspector, and composer.

```text
Smithers  repo: api  workspace: auth-fix  profile: Claude+SDK  mode: operator  2 runs  1 approval   Ctrl+O actions

|  auth refactor        [CC]  .1         12:41  You       Build a reusable Smithers workflow for auth fixes.
   docs sync            [AI]   v         12:41  Smithers  Plan:
!  pr review            [SM]  A1                   - inspect existing .smithers/workflows
   incident triage      [GM]   x                   - factor shared retry and review steps
                                                   - launch #auth-fix over current diff

                                              12:42  Run       auth-fix  a93f  running  validate -> patch  3/7
                                              12:43  Tool      smithers.workflows.read  .smithers/workflows/review-pr.tsx  18ms
                                              12:44  Approval  Push generated patch to workspace branch?
                                                               [Enter] open   [a] approve   [d] deny
                                              12:45  Artifact  .smithers/workflows/auth-fix.tsx

[#auth-fix] [@src/auth.ts] [@README.md]  Build it to be reusable, not one-off.
budget 18k ctx   Enter send   Alt+Enter queue   Ctrl+G editor
```

### Adaptive layouts

| Width | Behavior |
| --- | --- |
| >= 140 cols | Full three-column layout. Left rail 24 cols, right inspector 36--42 cols. |
| 100--139 cols | Inspector narrows to 28--32 cols. Less metadata in workspace rows. |
| 80--99 cols | Inspector becomes a toggleable overlay. Feed is the dominant pane. |
| < 80 cols | Single-pane mode. Workspace switcher and inspector are modal overlays. |
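
The breakpoints reduce to a simple threshold check. The tier names in this sketch are made up for illustration. Only the column thresholds come from the table.

```ts
type Layout = "three-column" | "narrow-inspector" | "overlay-inspector" | "single-pane";

// Map terminal width (in columns) to the layout tiers in the table above.
function layoutForWidth(cols: number): Layout {
  if (cols >= 140) return "three-column";
  if (cols >= 100) return "narrow-inspector";
  if (cols >= 80) return "overlay-inspector";
  return "single-pane";
}

console.log(layoutForWidth(120)); // "narrow-inspector"
```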

## Workspaces

A workspace is the top-level unit of activity. It contains a title, repo/cwd, current provider profile, mode, feed history, queued messages, linked runs, pinned context, and approval state.

The left rail shows all open workspaces. Each row displays:

- State marker (`|` active, `!` attention, `.` running)
- Title
- Provider tag (`[CC]`, `[SM]`, `[AI]`, `[GM]`, `[CX]`) — progressively disclosed (hidden until hover/focus in standard layouts)
- Compound Status badge (combines unread/approval into clear priority icons)

```text
|  auth refactor        [CC]  .1
   docs sync            [AI]   v
!  pr review            [SM]  A1
   incident triage      [GM]   x
```

Workspace actions: create, switch, close, archive, rename, pin, duplicate, fork from current.

Switching workspaces takes under 100ms. Active run and approval badges update without manual refresh.

## Activity feed

The center pane is a unified activity feed that mixes all orchestration activity:

| Item type | Display |
| --- | --- |
| User | Compact text with optional attachment pills |
| Assistant | Markdown with code blocks, collapsible long sections |
| Tool | Collapsed by default, one-line summary with name/target/status/duration |
| Run | Compact badge with workflow name, run ID, step, elapsed, progress |
| Approval | Detaches from feed into an Action Bar above the composer to prevent scrolling off-screen |
| Artifact | File name, type, source workflow/run, open/diff/copy affordances |
| Diff | Structured diff card |
| Error | Red label, compact summary first, stack collapsed underneath |
| Summary | Assistant-generated summaries for long activity blocks |

The feed streams incrementally and auto-scrolls unless you manually scroll away. A `LIVE` / `PAUSED` indicator appears in the feed header. To prevent scroll-blindness, active long-running workflows pin a sticky status header tracking progress, and related tool events are grouped with vertical ASCII spines.
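
The `LIVE` / `PAUSED` behavior can be read as a small state machine. The sketch below is an assumption about how such follow-mode logic might look, not the TUI's actual implementation; the event names are hypothetical, and resuming on a jump to the bottom is a guess.

```ts
// Hypothetical names: a sketch of feed follow-mode, not the TUI's real code.
type FollowMode = "LIVE" | "PAUSED";
type FeedEvent = "user-scrolled-away" | "user-jumped-to-bottom" | "item-appended";

function nextFollowMode(mode: FollowMode, event: FeedEvent): FollowMode {
  switch (event) {
    case "user-scrolled-away":
      return "PAUSED"; // manual scrolling stops auto-scroll
    case "user-jumped-to-bottom":
      return "LIVE"; // assumption: returning to the bottom resumes following
    case "item-appended":
      return mode; // new items never change the mode on their own
  }
}

// The feed only scrolls to new items while in LIVE mode.
function shouldAutoScroll(mode: FollowMode): boolean {
  return mode === "LIVE";
}
```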

### Empty states

New workspaces avoid the "blank terminal" syndrome by rendering a non-persistent Welcome Bento Board in the feed area. It displays the current repo's git status and proposes three suggested actions based on repository heuristics (e.g., noticing a `package.json` and suggesting a test workflow). The board scrolls away once feed items appear.

## Inspector

The right rail shows details for the currently selected feed item. 

The inspector is a dynamic, context-sensitive precision surface. Instead of showing persistent empty tabs, the pane's title and contents morph based entirely on the selection. A breadcrumb (e.g., `Inspector • Run a93f` or `Inspector • src/auth.ts`) grounds the user.

Depending on the selection, the inspector can render:
- **Run** -- run graph, status, steps, cost
- **Context** -- pinned context items, token budget
- **Workflow** -- schema, last runs, provider hints
- **Diff** -- file diffs
- **Logs** -- timestamped lifecycle events
- **Details** -- raw output, structured output, scorer results

The inspector reacts to the current feed selection immediately. Selecting a run item shows the run graph. Selecting a workflow mention shows schema and last runs. Selecting an attachment shows preview and token estimate.
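
One way to model a selection-driven pane is a discriminated union. The sketch below is illustrative only: `Selection`, `InspectorView`, and `inspectorFor` are hypothetical names, and the section lists simply restate the three cases described above.

```ts
// Hypothetical types; the three cases mirror the selections described above.
type Selection =
  | { kind: "run"; runId: string }
  | { kind: "workflow"; workflowId: string }
  | { kind: "attachment"; path: string };

interface InspectorView {
  breadcrumb: string;
  sections: string[];
}

function inspectorFor(selection: Selection): InspectorView {
  switch (selection.kind) {
    case "run":
      return { breadcrumb: `Inspector • Run ${selection.runId}`, sections: ["run graph"] };
    case "workflow":
      return { breadcrumb: `Inspector • ${selection.workflowId}`, sections: ["schema", "last runs"] };
    case "attachment":
      return { breadcrumb: `Inspector • ${selection.path}`, sections: ["preview", "token estimate"] };
  }
}
```

Because the pane's content is derived entirely from the selection, there is no separate tab state to keep in sync.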

## Composer

The composer is a small command desk at the bottom of the shell.

```text
[#review-pr] [@src/auth.ts] [@README.md] [+2]
Build a reusable auth-fix workflow and run it against current diff.
budget 18k ctx    Enter send    Alt+Enter queue    Ctrl+G editor
```

Features:
- Multiline input (auto-grows up to 6 rows, then scrolls)
- `@` unified mentions for attaching files, directories, images, workspaces, sessions, and runs
- `#` invokes workflows (opens a fuzzy workflow picker)
- Slash commands (`/run`, `/workflows`, `/approvals`, etc.)
- Queued follow-up messages with `Alt+Enter`
- Large paste guard -- detects large paste and offers attach-as-file, inline, summarize, or cancel
- Draft preserved while switching workspaces
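
To make the `@` and `#` prefixes concrete, here is a minimal sketch of splitting a raw draft into workflow invocations, context mentions, and plain text. `parseDraft` and `ParsedDraft` are hypothetical names; the real composer uses interactive pickers rather than plain token parsing.

```ts
// Hypothetical helper: splits a draft on whitespace and classifies each token.
interface ParsedDraft {
  workflows: string[]; // tokens that started with "#"
  mentions: string[]; // tokens that started with "@"
  text: string; // everything else, re-joined with single spaces
}

function parseDraft(raw: string): ParsedDraft {
  const workflows: string[] = [];
  const mentions: string[] = [];
  const words: string[] = [];
  for (const token of raw.split(/\s+/).filter((t) => t.length > 0)) {
    if (token.length > 1 && token.startsWith("#")) {
      workflows.push(token.slice(1));
    } else if (token.length > 1 && token.startsWith("@")) {
      mentions.push(token.slice(1));
    } else {
      words.push(token); // a bare "#" or "@" is kept as ordinary text
    }
  }
  return { workflows, mentions, text: words.join(" ") };
}
```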

## Keyboard model

### Global

| Key | Action |
| --- | --- |
| `Ctrl+O` | Open global command palette |
| `Tab` / `Shift+Tab` | Cycle focus: workspace rail, feed, inspector, composer |
| `Esc` | Dismiss overlay, abort transient action, return focus toward composer |
| `?` | Show shortcuts/help for current context |
| `.` | Open contextual action menu for selected item |
| `/` | Search current pane (when composer is not focused) |
| `Ctrl+L` | Provider / model / profile picker |
| `Ctrl+R` | Prompt history search |
| `Ctrl+G` | Open composer in external editor |

### Composer

| Key | Action |
| --- | --- |
| `Enter` | Send |
| `Alt+Enter` | Queue follow-up |
| `Shift+Enter` / `Ctrl+J` | Newline |
| `@` | Unified context attach (file/image/directory/run/session) |
| `#` | Invoke workflow |
| `Ctrl+A` / `Ctrl+E` | Start / end of line |
| `Alt+B` / `Alt+F` | Word backward / forward |
| `Ctrl+W` / `Ctrl+U` / `Ctrl+K` | Kill word / line before / line after |

### Feed and lists

| Key | Action |
| --- | --- |
| `Up` / `Down` or `k` / `j` | Move selection |
| `g` / `G` | Jump to top / bottom |
| `PageUp` / `PageDown` | Page |
| `Space` | Expand / collapse selected item |
| `Enter` | Default action (open detail view) |
| `v` | Toggle verbose view |
| `o` | Open artifact/diff/log in pager or external viewer |
| `/` | Filter/search within the current pane |

### Destructive actions

There is no global single-key kill or approve shortcut while an item is unfocused. Approval actions appear only inside the approval context, and confirmation dialogs always show the exact target. A per-workspace "always allow" path exists for repetitive safe actions.

## Modes

The TUI supports three operating modes, switchable via `/mode` or `Ctrl+L`:

### Operator (default)

The assistant prefers creating, modifying, and reusing Smithers workflows over direct file edits:

1. Inspect existing `.smithers/workflows/` first
2. Reuse or refactor shared Smithers components
3. Scaffold or edit workflows/scripts in `.smithers/`
4. Launch durable runs for non-trivial work
5. Monitor and report results
6. Use cheaper API providers for broad analysis
7. Escalate to harness-backed workers only when needed
8. Ask before direct edits outside `.smithers/`

### Plan

Read-only. No file writes, no destructive shell, no workflow execution without confirmation.

### Direct

Direct repo edits allowed. Still encourages Smithers scripts where useful but does not block one-off edits.
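
The file-write rules of the three modes can be summarized as a decision function. This is a sketch under stated assumptions: `canWriteFile` and `Decision` are hypothetical names, and the path check is a simple prefix test on repo-relative paths.

```ts
// Hypothetical sketch of the per-mode file-write policy described above.
type Mode = "operator" | "plan" | "direct";
type Decision = "allow" | "ask" | "deny";

function canWriteFile(mode: Mode, repoRelativePath: string): Decision {
  if (mode === "plan") return "deny"; // Plan is read-only
  if (mode === "direct") return "allow"; // Direct permits repo edits
  // Operator: edits under .smithers/ are fine, anything else needs confirmation.
  return repoRelativePath.startsWith(".smithers/") ? "allow" : "ask";
}
```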

## Provider routing

A provider profile routes work by task class. Example:

- Repo scan -> AI SDK / cheap model
- Workflow generation -> API strong model
- Repo implementation -> Claude Code or Codex harness
- Final summary -> cheap model

The current provider and routing policy are visible in the top line and editable via `/provider` and `/profiles`.
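
A provider profile of this shape is essentially a lookup table keyed by task class. The sketch below is illustrative: the type names and the `routeTask` helper are hypothetical, and the values restate the example routes above rather than real provider identifiers.

```ts
// Hypothetical shape; the routes restate the example list above.
type TaskClass = "repo-scan" | "workflow-generation" | "repo-implementation" | "final-summary";
type ProviderProfile = Record<TaskClass, string>;

const exampleProfile: ProviderProfile = {
  "repo-scan": "AI SDK / cheap model",
  "workflow-generation": "API strong model",
  "repo-implementation": "Claude Code or Codex harness",
  "final-summary": "cheap model",
};

function routeTask(profile: ProviderProfile, taskClass: TaskClass): string {
  return profile[taskClass];
}
```

Typing the profile as a `Record` over every task class means a profile that forgets a route fails to compile.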

## Workflow catalog

Type `#` in the composer to open the workflow picker:

```text
+-- Workflows ----------------------------------------------------------------+
| > review-pr            PR review against current diff         last v 4m      |
|   auth-fix             Reusable auth remediation flow         last x 1h      |
|   docs-refresh         Refresh docs and changelog             last v 1d      |
|                                                                              |
| review-pr                                                                    |
|   input: { target?: string, diff?: boolean, push?: boolean }                |
|   providers: SDK analyze -> Claude Code patch -> SDK summary                 |
|   tags: review, reusable, repo                                              |
+------------------------------------------------------------------------------+
```

The catalog auto-discovers workflows from `.smithers/workflows/`. Features:
- Favorites and recents
- Fuzzy search by ID, tags, description, provider hints
- Input schema summary
- Last-run status, duration, success rate
- Launch form generated from schema when possible
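
Fuzzy search over several fields can be approximated with a case-insensitive subsequence match. This is a minimal sketch, not the catalog's actual ranking: `WorkflowEntry`, `isSubsequence`, and `searchWorkflows` are hypothetical names, and results are filtered rather than scored.

```ts
// Hypothetical catalog entry; fields follow the search dimensions listed above.
interface WorkflowEntry {
  id: string;
  description: string;
  tags: string[];
}

// True when every character of `query` appears in `text` in order.
function isSubsequence(query: string, text: string): boolean {
  const q = query.toLowerCase();
  const t = text.toLowerCase();
  let matched = 0;
  for (let i = 0; i < t.length && matched < q.length; i++) {
    if (t[i] === q[matched]) matched++;
  }
  return matched === q.length;
}

function searchWorkflows(query: string, entries: WorkflowEntry[]): WorkflowEntry[] {
  return entries.filter((entry) =>
    [entry.id, entry.description, ...entry.tags].some((field) => isSubsequence(query, field)),
  );
}
```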

## Run monitoring

Run cards in the feed show:
- Workflow name, run ID, provider
- Elapsed time, step count, progress bar
- Latest node, approval state
- Retries, failures, token/cost summary

The inspector supports deep run inspection:
- Overview and DAG/step graph
- Node attempts
- Logs and chat transcript
- Artifacts and scorer results
- Raw/structured output
- Retry, resume, and cancel actions

Navigate from a feed item to the deep run inspector in one action. Attach to any active run. Run state persists if the TUI exits.

## Notifications

Events that trigger notifications:
- Approval needed
- Run failed or completed
- Provider disconnected
- Queued message delivered

Notifications appear as in-app badges on the workspace rail. Terminal bell, OSC notifications, and desktop notifications are available. Notifications are suppressed when the relevant workspace is focused.

## Slash commands

### Core

| Command | Purpose |
| --- | --- |
| `/help` | Help |
| `/new` | New workspace |
| `/resume` | Resume workspace |
| `/tree` | Session tree |
| `/compact` | Compact feed |
| `/clear` | Clear feed |
| `/export` | Export feed to markdown/JSON |
| `/theme` | Switch theme |
| `/settings` | Settings |

### Smithers

| Command | Purpose |
| --- | --- |
| `/workflows` | Open workflow catalog |
| `/run` | Launch a workflow |
| `/runs` | Show live runs |
| `/approvals` | Show pending approvals |
| `/telemetry` | Telemetry board |
| `/triggers` | Trigger manager |
| `/datagrid` | SQL query browser |
| `/docs` | Search Smithers docs |
| `/attach-run` | Attach to a run |
| `/resume-run` | Resume a run |
| `/cancel-run` | Cancel a run |

### Provider

| Command | Purpose |
| --- | --- |
| `/provider` | Switch provider profile |
| `/mode` | Switch mode (operator/plan/direct) |
| `/budget` | Token budget |
| `/profiles` | Manage provider profiles |

### Context

| Command | Purpose |
| --- | --- |
| `/attach` | Attach file/context |
| `/detach` | Remove context |
| `/history` | Prompt history |
| `/editor` | Open external editor |

## Command palette

Press `Ctrl+O` to open the global command palette. All slash commands and contextual actions are searchable here. Natural language also works -- type what you want and the assistant interprets it.

## Persistence and recovery

The TUI survives:
- Accidental exits (workspace restore on relaunch)
- TTY resize
- Provider disconnects (reconnect automatically)
- Broker crashes (workflow runs continue independently)
- Large paste mistakes (paste guard dialog)

Persisted state includes: last active workspace, composer draft, attachment chips, inspector tab, follow mode, selected feed entry, pending queued messages, and broker reconnect cursor.
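
The persisted fields can be pictured as one serializable record. The sketch below is an assumption about shape only: the interface name, field types, and JSON encoding are hypothetical, and the actual storage format is not documented here.

```ts
// Hypothetical shape; field names follow the list above, types are guesses.
interface PersistedTuiState {
  lastActiveWorkspace: string | null;
  composerDraft: string;
  attachmentChips: string[];
  inspectorTab: string | null;
  followMode: "LIVE" | "PAUSED";
  selectedFeedEntry: string | null;
  pendingQueuedMessages: string[];
  brokerReconnectCursor: number;
}

const emptyState: PersistedTuiState = {
  lastActiveWorkspace: null,
  composerDraft: "",
  attachmentChips: [],
  inspectorTab: null,
  followMode: "LIVE",
  selectedFeedEntry: null,
  pendingQueuedMessages: [],
  brokerReconnectCursor: 0,
};

function saveState(state: PersistedTuiState): string {
  return JSON.stringify(state);
}

// Falls back to defaults when the stored text is missing fields or corrupt.
function restoreState(json: string): PersistedTuiState {
  try {
    const value: unknown = JSON.parse(json);
    if (typeof value === "object" && value !== null && !Array.isArray(value)) {
      return { ...emptyState, ...(value as Partial<PersistedTuiState>) };
    }
  } catch {
    // Corrupt state: fall through to defaults.
  }
  return { ...emptyState };
}
```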

## Monitoring with Claude Code

Smithers persists all state to SQLite and exposes it through the CLI, so Claude Code can query status and report progress without interrupting execution.

Set up a recurring health check with `/cron`:

```
/cron 10m Check the smithers workflow running in this directory.
Run `smithers ps` to see active runs, then `smithers inspect <run-id>`
for the latest run. Summarize what tasks have completed, what's currently
running, any failures, and overall progress. Keep it brief.
```

With `--hot`, Smithers watches for file changes and hot-reloads the workflow definition. Claude Code can edit prompts, swap agents, or restructure the JSX tree mid-run. In-flight tasks keep their original code; new tasks use the updated definition.

Other patterns:

- **Ad hoc inspection** -- Read the Smithers database and explain a failed run
- **Approval handling** -- Run `smithers approve` or `smithers deny` based on criteria
- **Live tuning** -- With `--hot`, tweak prompts or switch models mid-run
- **Post-run analysis** -- Summarize outputs and suggest next steps

## Beyond the terminal

### Burns

[Burns](https://github.com/l3wi/burns) is a workspace-first local control plane for Smithers. It provides a React web app, an ElectroBun desktop shell, and a headless CLI for authoring, running, and supervising workflows. See [Ecosystem](/integrations/ecosystem).

### JJHub Cloud

[jjhub.tech](https://jjhub.tech) will have first-class Smithers support for hosted workflows with scheduling, observability, and team collaboration.

## Next steps

- [CLI Reference](/cli/overview) -- All CLI commands including `smithers tui`.
- [Monitoring & Logs](/guides/monitoring-logs) -- Observability with Grafana and Prometheus.
- [Debugging](/guides/debugging) -- Diagnosing workflow issues.

---

## Debugging

> Inspect runs, diagnose failures, and query internal state using the CLI, logs, and SQLite.
> Source: https://smithers.sh/guides/debugging

Three levels of inspection: CLI commands, NDJSON log files, and direct SQLite queries.

All CLI examples on this page use `bunx smithers-orchestrator ...`. The published package is `smithers-orchestrator`, not `smithers`.

## CLI Inspection

### inspect

```bash
bunx smithers-orchestrator inspect <id>
```

Shows run metadata, node statuses, approvals, and loop state.

### logs

```bash
bunx smithers-orchestrator logs <id> --tail 10
```

Shows persisted lifecycle events. Add `--follow` to keep tailing live events. For raw render frames, query `_smithers_frames` or use [renderFrame](/runtime/render-frame).

### chat

```bash
bunx smithers-orchestrator chat <id> --tail 5
```

Shows the most recent agent chat blocks for a run. Omit `<id>` to inspect the latest run.

### why

```bash
bunx smithers-orchestrator why <id>
```

Explains why a run is currently blocked or paused.

### node

```bash
bunx smithers-orchestrator node <node-id> --run-id <id>
```

Shows enriched node details for retries, tool calls, and latest output for a specific task.

### ps

```bash
bunx smithers-orchestrator ps --limit 20
bunx smithers-orchestrator ps --status running
bunx smithers-orchestrator ps --status failed
```

### graph

```bash
bunx smithers-orchestrator graph workflow.tsx --run-id <id>
bunx smithers-orchestrator graph workflow.tsx --input '{"description": "Fix bugs"}'
```

Shows task dependency structure. The second form previews the graph without running.

## NDJSON Logs

```
.smithers/executions/<runId>/logs/stream.ndjson
```

Each line is a JSON-encoded `SmithersEvent`.

```bash
tail -f .smithers/executions/<runId>/logs/stream.ndjson
```

Filter by event type:

```bash
grep '"type":"Node' .smithers/executions/<runId>/logs/stream.ndjson
grep '"type":"NodeFailed"' .smithers/executions/<runId>/logs/stream.ndjson
grep '"type":"ToolCall' .smithers/executions/<runId>/logs/stream.ndjson
```

Parse with `jq`:

```bash
tail -5 .smithers/executions/<runId>/logs/stream.ndjson | jq .
jq 'select(.type == "NodeFinished" or .type == "NodeFailed") | {nodeId, type}' .smithers/executions/<runId>/logs/stream.ndjson
```
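
The same filter can be sketched in TypeScript for scripts that post-process the log. This is a minimal illustration: `EventLike` is a hypothetical loose type that assumes only the `type` and `nodeId` fields used in the `jq` example, not the full `SmithersEvent` definition.

```ts
// Hypothetical loose event shape; only `type` and `nodeId` are assumed.
interface EventLike {
  type: string;
  nodeId?: string;
  [key: string]: unknown;
}

// Parses NDJSON text, skipping blank lines and lines that are not valid JSON
// (for example, a partially written last line while the run is still active).
function parseNdjson(text: string): EventLike[] {
  const events: EventLike[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const value: unknown = JSON.parse(trimmed);
      if (typeof value === "object" && value !== null && typeof (value as { type?: unknown }).type === "string") {
        events.push(value as EventLike);
      }
    } catch {
      // Ignore malformed lines.
    }
  }
  return events;
}

// Equivalent of: select(.type == "NodeFinished" or .type == "NodeFailed") | {nodeId, type}
function nodeOutcomes(events: EventLike[]): { nodeId: string | undefined; type: string }[] {
  return events
    .filter((e) => e.type === "NodeFinished" || e.type === "NodeFailed")
    .map((e) => ({ nodeId: e.nodeId, type: e.type }));
}
```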

Custom log location or disable:

```bash
bunx smithers-orchestrator up workflow.tsx --log-dir ./my-logs
bunx smithers-orchestrator up workflow.tsx --log false
```

## SQLite Inspection

### Internal Tables

| Table | Purpose |
|---|---|
| `_smithers_runs` | Run metadata: status, timestamps |
| `_smithers_nodes` | Per-node state: status, iteration, attempt count |
| `_smithers_attempts` | Per-attempt: status, times, errors, JJ pointer |
| `_smithers_frames` | Render frame snapshots (XML of JSX tree) |
| `_smithers_approvals` | Approval decisions |
| `_smithers_cache` | Cached task results |
| `_smithers_tool_calls` | Tool invocations: name, args, result, duration |
| `_smithers_events` | All events with sequence numbers and JSON payloads |
| `_smithers_ralph` | Loop iteration state |

### Queries

```bash
# Run status
sqlite3 smithers.db "SELECT run_id, status, created_at_ms, updated_at_ms FROM _smithers_runs ORDER BY created_at_ms DESC LIMIT 5;"

# Node states
sqlite3 smithers.db "SELECT node_id, status, iteration FROM _smithers_nodes WHERE run_id = '<id>' ORDER BY updated_at_ms;"

# Failed attempts
sqlite3 smithers.db "SELECT node_id, attempt, status, error_message FROM _smithers_attempts WHERE run_id = '<id>' AND status = 'failed';"

# Tool calls
sqlite3 smithers.db "SELECT tool_name, arguments, duration_ms FROM _smithers_tool_calls WHERE run_id = '<id>' AND node_id = '<node-id>';"

# Events
sqlite3 smithers.db "SELECT seq, type, payload_json FROM _smithers_events WHERE run_id = '<id>' ORDER BY seq LIMIT 50;"

# Approvals
sqlite3 smithers.db "SELECT node_id, approved, decided_by, note FROM _smithers_approvals WHERE run_id = '<id>';"

# Loop state
sqlite3 smithers.db "SELECT * FROM _smithers_ralph WHERE run_id = '<id>';"

# Output tables
sqlite3 smithers.db "SELECT * FROM analysis WHERE run_id = '<id>';"
sqlite3 smithers.db "SELECT * FROM report WHERE run_id = '<id>';"
```

## Common Issues

### Stuck at "waiting-approval"

```bash
bunx smithers-orchestrator why <id>
bunx smithers-orchestrator inspect <id>
bunx smithers-orchestrator approve <id> --node <node-id>
# or: bunx smithers-orchestrator deny <id> --node <node-id>
bunx smithers-orchestrator up workflow.tsx --run-id <id> --resume
```

### Duplicate task IDs

```
Error: Duplicate Task id detected: "analyze"
```

Every `<Task>` needs a globally unique `id`. For dynamic tasks, derive IDs from unique identifiers: ``id={`${item.id}:process`}``.
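
A quick way to catch collisions before rendering is to check the derived IDs yourself. The helper below is a hypothetical sketch, not part of Smithers; it only illustrates why deriving IDs from a unique key avoids the error.

```ts
// Hypothetical helper: returns each ID that appears more than once.
function findDuplicateIds(ids: string[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const id of ids) {
    if (seen.has(id)) duplicates.add(id);
    else seen.add(id);
  }
  return Array.from(duplicates);
}

// IDs derived from a unique key never collide; a shared constant ID does.
const items = [{ id: "ticket-1" }, { id: "ticket-2" }];
const derivedIds = items.map((item) => `${item.id}:process`);
const constantIds = items.map(() => "process");
```

Here `findDuplicateIds(derivedIds)` is empty, while `findDuplicateIds(constantIds)` reports `"process"`.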

### Missing output rows

`ctx.output()` throws "No output found" when the task has not completed, has failed, or had its `id` change between renders. Use `ctx.outputMaybe()` for conditional access:

```tsx
const result = ctx.outputMaybe("analysis", { nodeId: "analyze" });
if (result) {
  // safe to use
}
```

### Task keeps retrying

Check for schema validation errors, timeouts, or tool failures:

```bash
bunx smithers-orchestrator node <node-id> --run-id <id> --attempts --tools
grep "NodeRetrying" .smithers/executions/<runId>/logs/stream.ndjson | jq .
sqlite3 smithers.db "SELECT node_id, attempt, error_message FROM _smithers_attempts WHERE run_id = '<id>' AND status = 'failed';"
```

### Stale in-progress tasks

Tasks that have been in progress for more than 15 minutes are auto-cancelled and retried on resume:

```bash
bunx smithers-orchestrator up workflow.tsx --run-id <id> --resume
```
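
The staleness rule amounts to a threshold comparison. The sketch below assumes the check is made against the attempt's start time in milliseconds; which timestamp Smithers actually compares is not stated here, and `isStale` is a hypothetical name.

```ts
// 15 minutes, per the rule above.
const STALE_AFTER_MS = 15 * 60 * 1000;

// Assumption: staleness is measured from the attempt's start timestamp.
function isStale(startedAtMs: number, nowMs: number): boolean {
  return nowMs - startedAtMs > STALE_AFTER_MS;
}
```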

## Next Steps

- [Monitoring & Logs](/guides/monitoring-logs) -- Live monitoring with events and SSE.
- [Resumability](/guides/resumability) -- How resume handles stale tasks.
- [CLI Reference](/cli/overview) -- All CLI commands and options.

---

## Observability

> Export Smithers spans and metrics over OTLP, inspect persisted events, and add observability with Effect.
> Source: https://smithers.sh/guides/monitoring-logs

Four observability surfaces:

- Persisted lifecycle events
- Structured logs
- OpenTelemetry spans
- Effect metrics exported over OTLP

The runtime instruments workflow runs, nodes, tools, cache, approvals, database access, hot reloads, HTTP requests, and JJ commands automatically.

## Enable OpenTelemetry Export

```ts
import { Layer } from "effect";
import { createSmithersObservabilityLayer } from "smithers-orchestrator/observability";

// AgentLive stands in for your application's own layer(s), defined elsewhere.
const AppLive = Layer.mergeAll(
  AgentLive,
  createSmithersObservabilityLayer({
    enabled: true,
    endpoint: "http://localhost:4318",
    serviceName: "bugfix-worker",
    logFormat: "json",
  }),
);
```

Environment-based configuration:

```bash
export SMITHERS_OTEL_ENABLED=1
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_SERVICE_NAME=bugfix-worker
export SMITHERS_LOG_FORMAT=json
export SMITHERS_LOG_LEVEL=info
```

## Local Prometheus + Grafana Stack

```bash
docker compose -f observability/docker-compose.otel.yml up
```

| Service | Endpoint |
|---|---|
| OTLP collector | `http://localhost:4318` |
| Prometheus | `http://localhost:9090` |
| Grafana | `http://localhost:3000` |
| Tempo | `http://localhost:3200` |

The collector exports metrics on `:8889`; Prometheus scrapes it; Grafana ships with a pre-provisioned Smithers dashboard.

The built-in HTTP server also exposes `GET /metrics` in Prometheus text exposition format.

## Direct Prometheus Endpoint

```bash
curl http://localhost:7331/metrics
```

For custom servers:

```ts
import {
  prometheusContentType,
  renderPrometheusMetrics,
} from "smithers-orchestrator/observability";

const body = renderPrometheusMetrics();
// return body with Content-Type: prometheusContentType
```

## Built-in Metrics

```ts
import { smithersMetrics } from "smithers-orchestrator/observability";
```

| Category | Metrics |
|---|---|
| Runs | `runsTotal`, `activeRuns`, `runsResumedTotal`, `runsFinishedTotal`, `runsFailedTotal`, `runsCancelledTotal`, `runsContinuedTotal` |
| Nodes | `nodesStarted`, `nodesFinished`, `nodesFailed`, `activeNodes`, `nodeRetriesTotal` |
| Duration | `nodeDuration`, `attemptDuration`, `runDuration` |
| Tools | `toolCallsTotal`, `toolDuration`, `toolCallErrorsTotal`, `toolOutputTruncatedTotal` |
| Cache | `cacheHits`, `cacheMisses` |
| Database | `dbQueryDuration`, `dbRetries` |
| Database transactions | `dbTransactionDuration`, `dbTransactionRetries`, `dbTransactionRollbacks` |
| Scheduler | `schedulerQueueDepth`, `schedulerConcurrencyUtilization`, `schedulerWaitDuration` |
| Approvals | `approvalsRequested`, `approvalsGranted`, `approvalsDenied`, `approvalPending`, `approvalWaitDuration` |
| Async / external wait | `externalWaitAsyncPending` |
| Timers | `timersCreated`, `timersFired`, `timersCancelled`, `timersPending`, `timerDelayDuration` |
| Tokens | `tokensInputTotal`, `tokensOutputTotal`, `tokensCacheReadTotal`, `tokensCacheWriteTotal`, `tokensReasoningTotal`, `tokensInputPerCall`, `tokensOutputPerCall` |
| HTTP | `httpRequests`, `httpRequestDuration` |
| Hot reload | `hotReloads`, `hotReloadFailures`, `hotReloadDuration` |
| Sandbox | `sandboxCreatedTotal`, `sandboxCompletedTotal`, `sandboxActive`, `sandboxDurationMs` |
| Sandbox transport | `sandboxTransportDurationMs`, `sandboxBundleSizeBytes`, `sandboxPatchCount` |
| Prompt size | `promptSizeBytes` |
| Response size | `responseSizeBytes` |
| Run ancestry depth | `runsAncestryDepth` |
| Continue-as-new state | `runsCarriedStateBytes` |
| Scorer events | `scorerEventsStarted`, `scorerEventsFinished`, `scorerEventsFailed` |
| VCS | `vcsDuration` |
| Process | `processUptimeSeconds`, `processMemoryRssBytes`, `processHeapUsedBytes` |
| Errors | `errorsTotal`, `eventsEmittedTotal` |

The engine emits all of these metrics automatically.

## Observability as a Dependency

```ts
import { Context, Effect, Layer, Metric } from "effect";
import { SmithersObservability } from "smithers-orchestrator/observability";

const notificationsSent = Metric.counter("app.notifications.sent");

class Notifications extends Context.Tag("Notifications")<
  Notifications,
  {
    readonly send: (ticketId: string) => Effect.Effect<void>;
  }
>() {}

const NotificationsLive = Layer.effect(
  Notifications,
  Effect.gen(function* () {
    const obs = yield* SmithersObservability;

    return {
      send: (ticketId) =>
        obs.withSpan(
          "notifications:send",
          Effect.gen(function* () {
            yield* obs.annotate({
              ticketId,
              channel: "slack",
            });
            yield* Metric.increment(notificationsSent);
            yield* Effect.logInfo(`sending notification for ${ticketId}`);
          }),
          { component: "notifications" },
        ),
    };
  }),
);
```

Composition model:

- Call `createSmithersObservabilityLayer(...)` once in the app layer
- Depend on `SmithersObservability` in services that need Smithers-scoped spans
- Use standard Effect `Metric` primitives for custom application metrics

## Events and Logs

OTLP does not replace the durable event log. Every run still persists events to:

```txt
.smithers/executions/<runId>/logs/stream.ndjson
```

```bash
tail -f .smithers/executions/<runId>/logs/stream.ndjson
sqlite3 smithers.db "SELECT seq, type, payload_json FROM _smithers_events WHERE run_id = '<id>' ORDER BY seq DESC LIMIT 20;"
```

See [Events](/runtime/events) for the event model.

## Server Instrumentation

HTTP server requests are instrumented:

- `smithers.http.requests` counter
- `smithers.http.request_duration_ms` histogram
- Request, workflow-load, and body-read spans flow through the OTLP layer

## Next Steps

- [Debugging](/guides/debugging) -- CLI and SQLite failure diagnosis.
- [Events Reference](/runtime/events) -- Full event type definitions.
- [Server Integration](/integrations/server) -- HTTP server and SSE endpoint.

---

## Time Travel Quickstart

> Replay a failed run, diff two runs, fork and experiment, view the timeline.
> Source: https://smithers.sh/guides/time-travel-quickstart

This guide walks through the four core time travel operations: replaying a failed run, diffing two snapshots, forking a run for experimentation, and viewing the execution timeline.

## Prerequisites

- A Smithers project with at least one completed or failed run
- The `smithers` CLI installed (`bunx smithers`)

## Replay a Failed Run

Suppose you ran a workflow and it failed at the `implement` task:

```bash
bunx smithers up workflow.tsx --input '{"ticket":"AUTH-42"}' --run-id run-001
# [00:02:15] -> analyze (attempt 1, iteration 0)
# [00:02:45] checkmark analyze (attempt 1)
# [00:02:46] -> implement (attempt 1, iteration 0)
# [00:05:30] x implement (attempt 1): LLM returned invalid code
# [00:05:31] x Run failed
```

The run captured snapshots at every frame. To replay from the frame just before `implement` started:

```bash
bunx smithers replay workflow.tsx --run-id run-001 --frame 2 --node implement
```

This creates a new run forked from `run-001` at frame 2 with the `implement` node reset to pending. Everything before it (the `analyze` output) is preserved.

To replay with modified input:

```bash
bunx smithers replay workflow.tsx --run-id run-001 --frame 2 \
  --input '{"ticket":"AUTH-42","hint":"Use OAuth2 instead of session tokens"}' \
  --label "oauth2-attempt"
```

The `--label` flag gives this fork a human-readable name for the timeline.

## Diff Two Snapshots

Compare what changed between two frames in the same run:

```bash
bunx smithers diff workflow.tsx run-001:1 run-001:3
```

Output shows which nodes changed state, which outputs were added or modified, and whether the input or VCS pointer changed.
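
The node-state part of such a diff can be sketched as a comparison of two maps. This is an illustration only: `NodeStates` and `diffNodeStates` are hypothetical, and real snapshots also carry outputs, input, and VCS pointers.

```ts
// Hypothetical snapshot shape: node ID -> status.
type NodeStates = Record<string, string>;

interface NodeChange {
  nodeId: string;
  before: string | undefined; // undefined when the node did not exist yet
  after: string | undefined; // undefined when the node no longer exists
}

function diffNodeStates(a: NodeStates, b: NodeStates): NodeChange[] {
  const ids = Array.from(new Set([...Object.keys(a), ...Object.keys(b)])).sort();
  return ids
    .filter((id) => a[id] !== b[id])
    .map((id) => ({ nodeId: id, before: a[id], after: b[id] }));
}
```

Comparing a frame where `analyze` is running with a later frame where `analyze` has finished and `implement` is running yields one change per node.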

Compare the final states of two different runs (e.g., the original and a replay):

```bash
bunx smithers diff workflow.tsx run-001 run-002
```

For machine-readable output:

```bash
bunx smithers diff workflow.tsx run-001 run-002 --json
```

## Fork a Run

Forking creates a new run from a snapshot without immediately executing it. Use this when you want to set up the fork first and run it later:

```bash
# Fork from frame 2, reset the implement node
bunx smithers fork workflow.tsx --run-id run-001 --frame 2 \
  --reset-node implement \
  --label "experiment-1"

# The fork is created but not running. Start it:
bunx smithers up workflow.tsx --run-id <forked-run-id> --resume
```

To fork and run in one step:

```bash
bunx smithers fork workflow.tsx --run-id run-001 --frame 2 \
  --reset-node implement \
  --label "experiment-1" \
  --run
```

## View the Timeline

See the execution history for a run:

```bash
bunx smithers timeline run-001
```

```
run-001
  Frame 0  2024-03-15T10:00:00Z  (initial)
  Frame 1  2024-03-15T10:00:01Z  analyze: pending -> running
  Frame 2  2024-03-15T10:02:45Z  analyze: finished
  Frame 3  2024-03-15T10:02:46Z  implement: pending -> running
  Frame 4  2024-03-15T10:05:30Z  implement: failed
```

To include all forks in a tree view:

```bash
bunx smithers timeline run-001 --tree
```

```
run-001
  Frame 0  (initial)
  Frame 1  analyze: pending -> running
  Frame 2  analyze: finished
  |-- run-002 [experiment-1] (forked at frame 2)
  |     Frame 0  (from run-001:2)
  |     Frame 1  implement: pending -> running
  |     Frame 2  implement: finished
  Frame 3  implement: pending -> running
  Frame 4  implement: failed
```

JSON output for scripting:

```bash
bunx smithers timeline run-001 --tree --json
```

## Restoring VCS State

If your project uses jj and the run recorded VCS pointers, you can replay with filesystem state restored:

```bash
bunx smithers replay workflow.tsx --run-id run-001 --frame 2 \
  --node implement \
  --restore-vcs
```

This creates a jj workspace at the revision that was active at frame 2 and runs the workflow there, so the replay sees the same files as the original run.

---

## Evals Quickstart

> Add quality scoring to your Smithers workflow in under five minutes.
> Source: https://smithers.sh/guides/evals-quickstart

This guide walks you through adding scorers to an existing workflow. By the end you will have live scoring on every task run, with results visible in the CLI and TUI.

## Prerequisites

- A working Smithers workflow (see [Tutorial: Build a Workflow](/guides/tutorial-workflow))
- At least one `<Task>` with an agent

## Step 1: Import Scorers

```tsx
import {
  schemaAdherenceScorer,
  latencyScorer,
  relevancyScorer,
} from "smithers-orchestrator/scorers";
```

## Step 2: Attach Scorers to a Task

Add the `scorers` prop to any `<Task>`:

```tsx
<Task
  id="analyze"
  agent={claude}
  output={outputs.analysis}
  scorers={{
    schema: { scorer: schemaAdherenceScorer() },
    latency: { scorer: latencyScorer({ targetMs: 5000, maxMs: 30000 }) },
  }}
>
  <AnalysisPrompt />
</Task>
```

These two scorers are code-based and require no additional LLM calls.

## Step 3: Add LLM-based Scoring (Optional)

For LLM-as-judge evaluation, pass an agent to the scorer factory:

```tsx
import { AnthropicAgent } from "smithers-orchestrator";

const judge = new AnthropicAgent({
  model: "claude-sonnet-4-20250514",
});

<Task
  id="analyze"
  agent={claude}
  output={outputs.analysis}
  scorers={{
    schema: { scorer: schemaAdherenceScorer() },
    relevancy: {
      scorer: relevancyScorer(judge),
      sampling: { type: "ratio", rate: 0.2 },  // Score 20% of runs
    },
  }}
>
  <AnalysisPrompt />
</Task>
```
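
Ratio sampling like `{ type: "ratio", rate: 0.2 }` can be pictured as one random draw per run. The sketch below is an assumption about the mechanism, not Smithers' implementation; `shouldScore` is a hypothetical name, and the random source is injectable so the behavior can be tested.

```ts
// Mirrors the shape of the `sampling` option shown above.
interface RatioSampling {
  type: "ratio";
  rate: number; // fraction of runs to score, between 0 and 1
}

// Assumption: one uniform draw in [0, 1) per run decides whether to score it.
function shouldScore(sampling: RatioSampling, random: () => number = Math.random): boolean {
  return random() < sampling.rate;
}
```

Sampling keeps LLM-judge cost proportional to `rate` while code-based scorers can still run on every task.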

## Step 4: Run Your Workflow

```bash
smithers up workflow.tsx
```

If you are running a discovered workflow from `.smithers/workflows`, use `smithers workflow run <name>` instead.

Scorers run asynchronously after each task finishes, so they do not block the workflow.

## Step 5: View Scores

### CLI

```bash
# List all scores for a run
smithers scores <run_id>
```

Example output:

```
Scores for run abc123
┌──────────┬────────────────────┬───────┬───────────────────────────────┐
│ Node     │ Scorer             │ Score │ Reason                        │
├──────────┼────────────────────┼───────┼───────────────────────────────┤
│ analyze  │ Schema Adherence   │  1.00 │ Output matches schema         │
│ analyze  │ Latency            │  0.85 │ 7200ms (target: 5000ms)       │
│ analyze  │ Relevancy          │  0.92 │ Output directly addresses ... │
└──────────┴────────────────────┴───────┴───────────────────────────────┘
```

### TUI

Open the TUI with `smithers tui`, navigate to a task, and switch to the **Scores** tab to see per-task scoring results.

## Step 6: Custom Scorers

Build your own scorer with `createScorer`:

```ts
import { createScorer } from "smithers-orchestrator/scorers";

const wordCountScorer = createScorer({
  id: "word-count",
  name: "Word Count",
  description: "Scores based on output word count",
  score: async ({ output }) => {
    const words = String(output).split(/\s+/).length;
    const score = Math.min(words / 200, 1);
    return {
      score,
      reason: `Output contains ${words} words`,
    };
  },
});
```
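
Two edge cases are worth handling in a scorer like this: `"".split(/\s+/)` returns one element, so empty output counts as one word, and `String(output)` on an object yields `"[object Object]"`. A minimal sketch of a more careful counting helper, with hypothetical names, that could back the `score` function:

```ts
// Hypothetical helper: counts whitespace-separated words, treating empty text as 0.
function countWords(output: unknown): number {
  // Serialize structured output instead of relying on String(), which gives "[object Object]".
  const text = typeof output === "string" ? output : JSON.stringify(output) ?? "";
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Same formula as above: linear up to `target` words, capped at 1.
function wordCountScore(output: unknown, target: number = 200): number {
  return Math.min(countWords(output) / target, 1);
}
```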

## Step 7: LLM-as-Judge Custom Scorers

Use `llmJudge` to build custom LLM-based scorers:

```ts
import { llmJudge } from "smithers-orchestrator/scorers";

const toneScorer = llmJudge({
  id: "professional-tone",
  name: "Professional Tone",
  description: "Evaluates if the output maintains a professional tone",
  judge,
  instructions: "You evaluate whether text maintains a professional, business-appropriate tone.",
  promptTemplate: ({ input, output }) =>
    `Rate the professionalism of this response on a scale of 0-1.\n\nInput: ${String(input)}\n\nOutput: ${String(output)}\n\nRespond with a JSON object: { "score": <number>, "reason": "<explanation>" }`,
});
```
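
Because the prompt asks the judge for a JSON object, it helps to think about how such a reply is validated. The sketch below is a hypothetical parser, not what `llmJudge` does internally: it extracts the outermost `{...}` span, requires a finite numeric `score`, and clamps it to the 0-1 range.

```ts
interface JudgeResult {
  score: number;
  reason: string;
}

// Hypothetical parser for a judge reply such as:
//   Sure! { "score": 0.8, "reason": "Polite and concise" }
function parseJudgeResponse(raw: string): JudgeResult | null {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    const value: unknown = JSON.parse(raw.slice(start, end + 1));
    if (typeof value !== "object" || value === null) return null;
    const { score, reason } = value as { score?: unknown; reason?: unknown };
    if (typeof score !== "number" || !Number.isFinite(score)) return null;
    return {
      score: Math.min(Math.max(score, 0), 1), // clamp out-of-range scores
      reason: typeof reason === "string" ? reason : "",
    };
  } catch {
    return null; // reply contained braces but not valid JSON
  }
}
```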

## Batch Evaluation

For testing and offline evaluation, use `runScorersBatch` directly:

```ts
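import { runScorersBatch, schemaAdherenceScorer } from "smithers-orchestrator/scorers";

// `analysisSchema` and `adapter` are assumed to be defined elsewhere in your project.
const results = await runScorersBatch(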
  {
    myScorer: { scorer: schemaAdherenceScorer() },
  },
  {
    runId: "test-run",
    nodeId: "analyze",
    iteration: 0,
    attempt: 0,
    input: "Analyze this code",
    output: { summary: "The code is clean" },
    outputSchema: analysisSchema,
  },
  adapter,
);
```

---

## Voice Quickstart

> Set up text-to-speech and speech-to-text in a Smithers workflow.
> Source: https://smithers.sh/guides/voice-quickstart

## Prerequisites

Smithers ships with voice support built in. You need:

- An OpenAI API key (or another AI SDK-supported provider)
- `smithers-orchestrator` version 0.12.8 or later

## Install

No extra packages. The `ai` and `@ai-sdk/openai` dependencies are already included.

## Create a Voice Provider

The simplest provider wraps AI SDK models for batch TTS and STT:

```ts
import { createAiSdkVoice } from "smithers-orchestrator/voice";
import { openai } from "@ai-sdk/openai";

const voice = createAiSdkVoice({
  speechModel: openai.speech("tts-1"),
  transcriptionModel: openai.transcription("whisper-1"),
});
```

## Add Voice to a Workflow

Wrap tasks with the `<Voice>` component:

```tsx
import { Workflow, Task, Voice, createSmithers } from "smithers-orchestrator";
import { z } from "zod";

const { outputs, workflow } = createSmithers({
  transcript: z.object({ text: z.string() }),
  summary: z.object({ content: z.string() }),
});

export default (
  <Workflow name="voice-demo">
    <Voice provider={voice} speaker="alloy">
      <Task id="transcribe" output={outputs.transcript} agent={myAgent}>
        Transcribe the audio input and return the text.
      </Task>
      <Task id="summarize" output={outputs.summary} agent={myAgent} dependsOn={["transcribe"]}>
        Summarize the transcript.
      </Task>
    </Voice>
  </Workflow>
);
```

## Use Composite Voice

Mix different providers for input and output:

```ts
import { createCompositeVoice, createAiSdkVoice } from "smithers-orchestrator/voice";
import { openai } from "@ai-sdk/openai";

const stt = createAiSdkVoice({
  transcriptionModel: openai.transcription("whisper-1"),
});

const tts = createAiSdkVoice({
  speechModel: openai.speech("tts-1"),
});

const voice = createCompositeVoice({
  input: stt,
  output: tts,
});
```

## Use Realtime Voice

For low-latency bidirectional audio, use the OpenAI Realtime provider:

```ts
import { createOpenAIRealtimeVoice } from "smithers-orchestrator/voice";

const realtime = createOpenAIRealtimeVoice({
  apiKey: process.env.OPENAI_API_KEY,
  model: "gpt-4o-mini-realtime-preview-2024-12-17",
  speaker: "alloy",
});

// Connect before use
await realtime.connect();

// Listen for events
realtime.on("speaking", (data) => {
  // handle audio output
});

realtime.on("writing", (data) => {
  // handle text transcription
});

// Send audio
await realtime.send(audioStream);

// Disconnect when done
realtime.close();
```

## Voice with Effect.ts

Use the Effect service layer for typed voice operations:

```ts
import { VoiceService, speak, listen } from "smithers-orchestrator/voice";
import { Effect } from "effect";

const program = Effect.gen(function* () {
  const text = yield* listen(audioStream);
  const audio = yield* speak(`The transcript says: ${text}`);
  return { text, audio };
}).pipe(Effect.provideService(VoiceService, voice));
```

## Supported Providers

Any Vercel AI SDK provider that exposes speech or transcription models works with `createAiSdkVoice`:

| Provider | TTS | STT |
| --- | --- | --- |
| OpenAI | `openai.speech("tts-1")` | `openai.transcription("whisper-1")` |
| ElevenLabs | `elevenlabs.speech(...)` | `elevenlabs.transcription(...)` |
| Deepgram | -- | `deepgram.transcription("nova-3")` |
| Google | `google.speech(...)` | `google.transcription(...)` |

For realtime speech-to-speech, use `createOpenAIRealtimeVoice` directly.

---

## Memory Quickstart

> Set up cross-run memory, add recall to tasks, and use loop memory with Ralph.
> Source: https://smithers.sh/guides/memory-quickstart

## Prerequisites

- `smithers-orchestrator` version 0.14 or later
- An embedding model (OpenAI, Google, Cohere -- anything the AI SDK supports)

## Set Up the Memory Store

The memory store uses your workflow's existing SQLite database:

```ts
import { createSmithers } from "smithers-orchestrator";
import { createMemoryStore } from "smithers-orchestrator/memory";
import { z } from "zod";

const { outputs, workflow } = createSmithers({
  analysis: z.object({ summary: z.string(), score: z.number() }),
});

const store = createMemoryStore(workflow.db);
```

## Write and Read Facts

Facts are key-value pairs scoped to a namespace:

```ts
import type { MemoryNamespace } from "smithers-orchestrator/memory";

const ns: MemoryNamespace = { kind: "workflow", id: "code-review" };

await store.setFact(ns, "reviewer-preference", { style: "thorough", language: "typescript" });

const fact = await store.getFact(ns, "reviewer-preference");
console.log(JSON.parse(fact!.valueJson));
// { style: "thorough", language: "typescript" }
```

## Add Semantic Recall

Set up semantic memory by connecting the RAG vector store and an embedding model:

```ts
import { createSemanticMemory } from "smithers-orchestrator/memory";
import { createSqliteVectorStore } from "smithers-orchestrator/rag";
import { openai } from "@ai-sdk/openai";

const vectorStore = createSqliteVectorStore(workflow.db);
const embeddingModel = openai.embedding("text-embedding-3-small");
const semantic = createSemanticMemory(vectorStore, embeddingModel);
```

Store and retrieve by meaning:

```ts
await semantic.remember(ns, "The code-review workflow detected 3 critical bugs in the auth module");

const results = await semantic.recall(ns, "What bugs were found?", { topK: 3 });
for (const r of results) {
  console.log(r.score.toFixed(2), r.chunk.content);
}
```
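Conceptually, recall by meaning ranks stored chunks by how close their embeddings are to the query embedding. The sketch below shows cosine-similarity top-K ranking over plain arrays. It is an illustration only: `cosine`, `topK`, and the tiny hand-written vectors are hypothetical, and the similarity metric Smithers actually uses is an assumption.

```ts
// Hypothetical stand-ins; not the smithers-orchestrator API.
type Stored = { content: string; embedding: number[] };

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Score every stored item against the query and keep the k best.
function topK(query: number[], items: Stored[], k: number) {
  return items
    .map((item) => ({ score: cosine(query, item.embedding), content: item.content }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const memories: Stored[] = [
  { content: "3 critical bugs in the auth module", embedding: [0.9, 0.1, 0.0] },
  { content: "Reviewer prefers thorough reviews", embedding: [0.1, 0.9, 0.0] },
  { content: "Build took 4 minutes", embedding: [0.0, 0.1, 0.9] },
];

// A query vector pointing in the "bugs" direction ranks the bug memory first.
const ranked = topK([1, 0, 0], memories, 2);
console.log(ranked.map((r) => r.content));
```

In a real setup the vectors come from the embedding model, so `topK: 3` in `recall` simply bounds how many of the best-scoring chunks are returned.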

## Add Memory to a Task

Use the `memory` prop on `<Task>` to automatically recall context before the agent runs and store output after:

```tsx
import { Task, Loop, Workflow } from "smithers-orchestrator";

const ns = { kind: "workflow" as const, id: "code-review" };

export default workflow(({ input }) => (
  <Workflow name="code-review">
    <Task
      id="review"
      agent={reviewer}
      output={outputs.analysis}
      memory={{
        recall: { namespace: ns, topK: 3 },
        remember: { namespace: ns, key: "last-review" },
      }}
    >
      Review the code changes in {input.prUrl}
    </Task>
  </Workflow>
));
```

On the first run, recall finds nothing and the agent runs normally. On subsequent runs, the agent receives the most relevant past reviews as context.

## Loop Memory with Ralph

Build iterative workflows where each loop iteration learns from the previous ones:

```tsx
const ns = { kind: "workflow" as const, id: "iterative-improve" };

export default workflow(({ input }) => (
  <Workflow name="iterative-improve">
    <Loop until={done} maxIterations={5}>
      <Task
        id="improve"
        agent={improver}
        output={outputs.analysis}
        memory={{
          recall: { namespace: ns, topK: 5 },
          remember: { namespace: ns },
        }}
      >
        Improve the code based on previous feedback
      </Task>
    </Loop>
  </Workflow>
));
```

Each iteration stores its output and the next iteration recalls the most relevant prior outputs.

## Message History

Track ordered conversations across runs:

```ts
const thread = await store.createThread(ns, "PR #42 Review");

await store.saveMessage({
  threadId: thread.threadId,
  role: "assistant",
  contentJson: JSON.stringify({ text: "Found 3 issues in auth.ts" }),
  runId: "run-abc",
  nodeId: "review",
});

const messages = await store.listMessages(thread.threadId, 10);
const total = await store.countMessages(thread.threadId);

// Retrieve a thread by ID
const existing = await store.getThread(thread.threadId);

// Delete a thread and its messages
await store.deleteThread(thread.threadId);
```

## Processors

Run maintenance on stored memory:

```ts
import { TtlGarbageCollector, TokenLimiter, Summarizer } from "smithers-orchestrator/memory";

// Delete expired facts
const gc = TtlGarbageCollector();
await gc.process(store);

// Compress message history exceeding a token budget
const limiter = TokenLimiter(4000);
await limiter.process(store);

// Summarize old messages with an LLM
const summarizer = Summarizer(myAgent);
await summarizer.process(store);
```

## CLI

Inspect memory from the command line:

```bash
# List all facts in a namespace
smithers memory list workflow:code-review

# Semantic search
smithers memory recall "What bugs were found?" --namespace workflow:code-review
```

---

## RAG Quickstart

> Set up a RAG pipeline, ingest documents, and query from a workflow.
> Source: https://smithers.sh/guides/rag-quickstart

## Prerequisites

- `smithers-orchestrator` version 0.12.8 or later
- An OpenAI API key (or another AI SDK-supported embedding provider)

No extra packages needed. The `ai` and `@ai-sdk/openai` dependencies are already included.

## Create a Vector Store

The vector store uses your workflow's existing SQLite database:

```ts
import { createSqliteVectorStore } from "smithers-orchestrator/rag";
import { createSmithers } from "smithers-orchestrator";
import { z } from "zod";

const { outputs, workflow } = createSmithers({
  answer: z.object({ text: z.string() }),
});

const store = createSqliteVectorStore(workflow.db);
```

## Build a Pipeline

Wire together chunking, embedding, and storage:

```ts
import { createRagPipeline } from "smithers-orchestrator/rag";
import { openai } from "@ai-sdk/openai";

const pipeline = createRagPipeline({
  vectorStore: store,
  embeddingModel: openai.embedding("text-embedding-3-small"),
  chunkOptions: { strategy: "markdown", size: 1000, overlap: 200 },
});
```
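To make `size` and `overlap` concrete, here is a minimal sketch of fixed-size chunking with overlap. It is not the Smithers chunker: `chunkText` is a hypothetical helper, it counts characters (the unit Smithers uses for `size` is not stated here), and it ignores the structure-aware splitting that the `"markdown"` strategy implies.

```ts
// Hypothetical helper, not part of smithers-orchestrator.
// Splits text into windows of `size` characters; each window starts
// `size - overlap` characters after the previous one.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("expected size > 0 and 0 <= overlap < size");
  }
  const step = size - overlap;
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this window reached the end
  }
  return chunks;
}

// With size 4 and overlap 2, consecutive chunks share two characters.
const chunks = chunkText("abcdefghij", 4, 2);
console.log(chunks);
```

Overlap trades extra storage for context: a sentence that straddles a chunk boundary still appears whole in at least one chunk.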

## Ingest Documents

Load and ingest files:

```ts
await pipeline.ingestFile("./docs/api-reference.md");
await pipeline.ingestFile("./docs/architecture.md");
```

Or create documents from strings:

```ts
import { createDocument } from "smithers-orchestrator/rag";

const doc = createDocument(
  "Smithers uses a unidirectional dataflow model...",
  { metadata: { source: "design-doc" } },
);
await pipeline.ingest([doc]);
```

## Query the Pipeline

```ts
const results = await pipeline.retrieve("How does the scheduler work?", {
  topK: 5,
});

for (const r of results) {
  console.log(`[${r.score.toFixed(3)}] ${r.chunk.content.slice(0, 100)}...`);
}
```

## Give Agents a RAG Tool

Create a tool that agents can call to search the knowledge base:

```ts
import { createRagTool } from "smithers-orchestrator/rag";

const searchDocs = createRagTool(pipeline, {
  name: "search_docs",
  description: "Search project documentation for relevant context",
});
```

Use it in a workflow:

```tsx
import { Workflow, Task, OpenAIAgent } from "smithers-orchestrator";

const agent = new OpenAIAgent({
  model: "gpt-4o",
  tools: { search_docs: searchDocs },
});

export default (
  <Workflow name="docs-qa">
    <Task id="answer" output={outputs.answer} agent={agent}>
      Answer the user's question using the search_docs tool.
    </Task>
  </Workflow>
);
```

## Use Namespaces

Keep different document collections separate:

```ts
const apiPipeline = createRagPipeline({
  vectorStore: store,
  embeddingModel: openai.embedding("text-embedding-3-small"),
  chunkOptions: { strategy: "markdown", size: 1000, overlap: 200 },
  namespace: "api-docs",
});

const designPipeline = createRagPipeline({
  vectorStore: store,
  embeddingModel: openai.embedding("text-embedding-3-small"),
  chunkOptions: { strategy: "recursive", size: 800, overlap: 100 },
  namespace: "design-docs",
});
```

## CLI Usage

Ingest and query without writing code:

```bash
# Ingest a markdown file
smithers rag ingest ./docs/api.md --workflow my-workflow.tsx --namespace api-docs

# Query the knowledge base
smithers rag query "authentication flow" --workflow my-workflow.tsx --namespace api-docs --top-k 3
```

## Next Steps

- Read [RAG Concepts](/concepts/rag) for details on chunking strategies and vector storage
- See [Structured Output](/guides/structured-output) for validating agent responses
- See [Model Selection](/guides/model-selection) for choosing embedding models

---

## OpenAPI Tools Quickstart

> Load an OpenAPI spec, create tools, and give them to an agent in under five minutes.
> Source: https://smithers.sh/guides/openapi-tools-quickstart

## Prerequisites

You need a Smithers project and an OpenAPI 3.0+ spec. The spec can be a JSON file, a YAML file, a URL, or a plain JavaScript object.

## Install

No extra dependencies. OpenAPI tools are built into `smithers-orchestrator`.

## Load a Spec and Create Tools

```ts
import { createOpenApiTools } from "smithers-orchestrator";

const tools = await createOpenApiTools("./petstore.json", {
  baseUrl: "https://api.petstore.example.com",
  auth: { type: "bearer", token: process.env.PETSTORE_TOKEN! },
});

console.log(Object.keys(tools));
// ["listPets", "createPet", "getPet"]
```

Each key is the `operationId` from the spec. Each value is an AI SDK `tool()`.

## Use Tools in a Workflow

```tsx
import { Workflow, Task, AnthropicAgent, createOpenApiTools, createSmithers } from "smithers-orchestrator";
import { z } from "zod";

// `output` is required on every <Task>, so declare a schema for the summary.
const { outputs } = createSmithers({
  petSummary: z.object({ summary: z.string() }),
});

const petTools = await createOpenApiTools("./petstore.json", {
  baseUrl: "https://api.petstore.example.com",
  auth: { type: "bearer", token: process.env.PETSTORE_TOKEN! },
});

const agent = new AnthropicAgent({
  model: "claude-sonnet-4-20250514",
  tools: petTools,
});

export default (
  <Workflow name="list-pets">
    <Task id="list-pets" output={outputs.petSummary} agent={agent}>
      List all available pets and summarize them.
    </Task>
  </Workflow>
);
```

## Filter Operations

```ts
// Only expose read operations
const readTools = await createOpenApiTools("./petstore.json", {
  include: ["listPets", "getPet"],
});

// Expose everything except destructive operations
const safeTools = await createOpenApiTools("./petstore.json", {
  exclude: ["deletePet"],
});
```

## Single Tool

```ts
import { createOpenApiTool } from "smithers-orchestrator";

const listPets = await createOpenApiTool("./petstore.json", "listPets", {
  baseUrl: "https://api.petstore.example.com",
});
```

## Preview Tools from CLI

```bash
smithers openapi list ./petstore.json
```

Output:

```
  listPets — List all pets
  createPet — Create a pet
  getPet — Get a pet by ID

  3 tool(s) from spec
```

## Authentication Options

```ts
// Bearer token
await createOpenApiTools(spec, {
  auth: { type: "bearer", token: "sk-..." },
});

// Basic auth
await createOpenApiTools(spec, {
  auth: { type: "basic", username: "admin", password: "secret" },
});

// API key in header
await createOpenApiTools(spec, {
  auth: { type: "apiKey", name: "X-API-Key", value: "key123", in: "header" },
});

// API key in query string
await createOpenApiTools(spec, {
  auth: { type: "apiKey", name: "api_key", value: "key123", in: "query" },
});

// Custom headers
await createOpenApiTools(spec, {
  headers: { "X-Request-Id": "abc123" },
});
```
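As a rough mental model, each `auth` option corresponds to a standard HTTP credential placement. The sketch below shows the conventional mapping: bearer tokens and basic credentials in the `Authorization` header, API keys in a named header or query parameter. `applyAuth`, `AuthOption`, and `RequestParts` are hypothetical names for illustration; how Smithers builds requests internally is not shown in this guide.

```ts
// Hypothetical types mirroring the option shapes above.
type AuthOption =
  | { type: "bearer"; token: string }
  | { type: "basic"; username: string; password: string }
  | { type: "apiKey"; name: string; value: string; in: "header" | "query" };

type RequestParts = {
  headers: Record<string, string>;
  query: Record<string, string>;
};

// Returns the headers and query parameters a request would carry.
function applyAuth(auth: AuthOption): RequestParts {
  const parts: RequestParts = { headers: {}, query: {} };
  switch (auth.type) {
    case "bearer":
      parts.headers["Authorization"] = `Bearer ${auth.token}`;
      break;
    case "basic":
      // btoa is a global in browsers, Bun, Deno, and Node 16+.
      parts.headers["Authorization"] = `Basic ${btoa(`${auth.username}:${auth.password}`)}`;
      break;
    case "apiKey":
      if (auth.in === "header") parts.headers[auth.name] = auth.value;
      else parts.query[auth.name] = auth.value;
      break;
  }
  return parts;
}

console.log(applyAuth({ type: "apiKey", name: "api_key", value: "key123", in: "query" }));
```

Prefer header placement when the API allows it, since query-string keys tend to end up in access logs and browser history.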

## Name Prefixes

When combining tools from multiple specs, use `namePrefix` to avoid collisions:

```ts
const petTools = await createOpenApiTools(petSpec, { namePrefix: "pet_" });
const orderTools = await createOpenApiTools(orderSpec, { namePrefix: "order_" });

const allTools = { ...petTools, ...orderTools };
```
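The collision that `namePrefix` avoids comes from object spread: when two tool maps share a key, the later spread silently overwrites the earlier one. A small sketch with stand-in objects shows the effect. The `getById` operation id and the `withPrefix` helper are hypothetical; real tool maps come from `createOpenApiTools`.

```ts
// Stand-in for a map of tools keyed by operationId.
type ToolMap = Record<string, { description: string }>;

const petToolsRaw: ToolMap = { getById: { description: "Get a pet" } };
const orderToolsRaw: ToolMap = { getById: { description: "Get an order" } };

// Without prefixes the second spread wins and the pet tool is lost.
const collided = { ...petToolsRaw, ...orderToolsRaw };

// Illustrates what a name prefix achieves: every key becomes distinct.
function withPrefix(tools: ToolMap, prefix: string): ToolMap {
  const prefixed: ToolMap = {};
  for (const key of Object.keys(tools)) {
    prefixed[`${prefix}${key}`] = tools[key];
  }
  return prefixed;
}

const merged = {
  ...withPrefix(petToolsRaw, "pet_"),
  ...withPrefix(orderToolsRaw, "order_"),
};

console.log(Object.keys(collided)); // one key survives
console.log(Object.keys(merged)); // both tools survive under prefixed names
```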

---

## Troubleshooting

> Common setup, runtime, and workflow issues with diagnosis paths.
> Source: https://smithers.sh/guides/troubleshooting

## By Subsystem

| Symptom | Guide |
|---|---|
| Workflow state, stuck runs, SQLite inspection | [Debugging](/guides/debugging) |
| Retries, timeouts, skips, graceful degradation | [Error Handling](/guides/error-handling) |
| File-watcher and prompt reload behavior | [Hot Reload](/guides/hot-reload) |
| Event streaming, approvals, server APIs | [Server Integration](/integrations/server) |
| Workspace snapshots and revert behavior | [VCS Integration](/guides/vcs) |

## Common Failures

**A task did not rerun** -- Smithers resumes by `runId`. If a node already completed for the current `runId`, it is skipped. Use a new `runId` for a fresh run, or inspect persisted attempts first.

**MDX prompt rendered as `[object Object]`** -- The MDX preload was not registered. Confirm `preload.ts` calls `mdxPlugin()` and `bunfig.toml` points at the preload file. If running tests, verify that the `[test]` section in `bunfig.toml` also lists the preload. See [Package Configuration](/reference/package-configuration#test-configuration).

**Workflow code changed but running task did not** -- Hot reload only affects unscheduled work. In-flight tasks continue with the code they started with.

## General Diagnosis

```bash
smithers inspect <runId>
smithers logs <runId> --tail 20 --follow false
```

If the CLI summary is insufficient, query SQLite directly. See [Debugging](/guides/debugging).

## Agent Diagnostics

When a CLI agent fails, Smithers automatically runs diagnostics to check for common issues (binary not installed, API key invalid, rate limit hit). The diagnostic summary is printed to stderr:

```
[diagnostics] claude: api_key_valid=fail: Rate limit detected in error (142ms)
```

The full diagnostic report is also attached to the error's `details.diagnostics` field for programmatic inspection.

---

## <Workflow>

> Root container that defines a named, cacheable workflow with implicit sequential execution of its children.
> Source: https://smithers.sh/components/workflow

```tsx
import { Workflow } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `name` | `string` | **(required)** | Workflow name. Used in logs, CLI, run metadata, and event streams. Need not be globally unique across files. |
| `cache` | `boolean` | `false` | Enable per-node output caching. Completed tasks are skipped on [resume](/concepts/suspend-and-resume). |
| `children` | `ReactNode` | `undefined` | [`<Task>`](/components/task) and [control-flow components](/concepts/control-flow) that make up the workflow. |

## Implicit sequencing

Direct children execute sequentially, top to bottom. A bare `<Workflow>` behaves identically to wrapping children in [`<Sequence>`](/components/sequence):

```tsx
// These two are equivalent:

<Workflow name="example">
  <Task id="first" output={outputs.first}>{/* ... */}</Task>
  <Task id="second" output={outputs.second}>{/* ... */}</Task>
</Workflow>

<Workflow name="example">
  <Sequence>
    <Task id="first" output={outputs.first}>{/* ... */}</Task>
    <Task id="second" output={outputs.second}>{/* ... */}</Task>
  </Sequence>
</Workflow>
```

An explicit [`<Sequence>`](/components/sequence) is only needed when nesting sequential groups inside [`<Parallel>`](/components/parallel) or other [control-flow components](/concepts/control-flow).

## Caching

When `cache` is enabled, the runtime checks whether a task's output row exists before executing it. If present, the task is skipped and stored output is reused. This makes workflows [resumable](/concepts/suspend-and-resume) after partial failures.

```tsx
<Workflow name="pipeline" cache>
  <Task id="expensive-step" output={outputs.expensiveStep} agent={myAgent}>
    Perform a costly analysis.
  </Task>
  <Task id="cheap-step" output={outputs.cheapStep}>
    {{ status: "done" }}
  </Task>
</Workflow>
```
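The skip-if-output-exists check can be pictured as a lookup keyed by run and node. This is a simplified in-memory sketch, not the runtime's implementation: `runTaskWithCache`, the `Map`-backed store, and the key format are all assumptions made for illustration (the real runtime reads output rows from SQLite).

```ts
type OutputRow = Record<string, unknown>;

// Stand-in for the output table: one row per (runId, nodeId).
const outputRows = new Map<string, OutputRow>();

function runTaskWithCache(
  runId: string,
  nodeId: string,
  execute: () => OutputRow,
): { row: OutputRow; skipped: boolean } {
  const key = `${runId}:${nodeId}`;
  const existing = outputRows.get(key);
  if (existing !== undefined) {
    return { row: existing, skipped: true }; // reuse the stored output
  }
  const row = execute();
  outputRows.set(key, row);
  return { row, skipped: false };
}

// Count executions to show that the second call does no work.
let calls = 0;
const expensive = (): OutputRow => {
  calls += 1;
  return { findings: "costly analysis result" };
};

const first = runTaskWithCache("run-1", "expensive-step", expensive);
const second = runTaskWithCache("run-1", "expensive-step", expensive);
console.log(calls, first.skipped, second.skipped);
```

A different `runId` produces a different key in this sketch, which matches the idea that a fresh run starts without stored outputs.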

## Full example

```tsx
import { createSmithers } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, Task, smithers, outputs } = createSmithers({
  research: z.object({ findings: z.string() }),
  summary: z.object({ summary: z.string() }),
});

const researcher = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a research assistant.",
});

export default smithers((ctx) => (
  <Workflow name="research-pipeline" cache>
    <Task id="research" output={outputs.research} agent={researcher}>
      {`Research the topic: ${ctx.input.topic}`}
    </Task>
    <Task id="summary" output={outputs.summary}>
      {{ summary: "Workflow complete." }}
    </Task>
  </Workflow>
));
```

## Rendering

`<Workflow>` renders as a `<smithers:workflow>` host element. The runtime traverses this tree to extract [`TaskDescriptor`](/reference/types) objects and build the execution plan.

## Notes

- Every workflow must have exactly one `<Workflow>` at the root.
- `name` is for identification in logs, CLI, and event streams; it need not be globally unique across files.
- Custom Drizzle tables must include `runId` and `nodeId` columns. Tasks inside [`<Loop>`](/components/loop) additionally need `iteration`. `createSmithers(...)` adds these columns automatically for schema-driven tables.

---

## <Task>

> A single executable node that produces output by calling an AI agent, running a compute callback, or emitting a static payload.
> Source: https://smithers.sh/components/task

```tsx
import { Task } from "smithers-orchestrator";
```

Three modes of operation:

- **Agent** -- `agent` provided; children become the prompt.
- **Compute** -- children is a function, no `agent`; function is called at execution time.
- **Static** -- children is a plain value, no `agent`; value is written directly as output.
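The mode is implied by which props are present. A minimal sketch of that decision, based only on the three rules listed above: `resolveTaskMode` is a hypothetical helper rather than a Smithers export, and it deliberately ignores `deps` render functions.

```ts
type TaskMode = "agent" | "compute" | "static";

// Hypothetical helper: classify a task from its props.
function resolveTaskMode(props: { agent?: unknown; children: unknown }): TaskMode {
  if (props.agent !== undefined) return "agent"; // children become the prompt
  if (typeof props.children === "function") return "compute"; // called at execution time
  return "static"; // value is written directly as output
}

console.log(resolveTaskMode({ agent: {}, children: "Review the patch." }));
console.log(resolveTaskMode({ children: () => ({ total: 3 }) }));
console.log(resolveTaskMode({ children: { status: "done" } }));
```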

For human gates, `<Task needsApproval>` pauses before execution. For explicit decision nodes, use [`<Approval>`](/components/approval).

When `needsApproval` and `async` are both set, sequence traversal can continue past the gate, but anything that explicitly depends on this task or reads its output still waits for the approval to resolve.

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | **(required)** | Stable node identity. Must be unique within the workflow. |
| `output` | `z.ZodObject \| Table \| string` | **(required)** | Output destination. [Zod](https://zod.dev) schema from `outputs` (recommended), Drizzle table, or string key. |
| `outputSchema` | `z.ZodObject` | `undefined` | Expected agent output structure. Inferred when `output` is a Zod schema. When provided with a [React JSX](https://react.dev/learn/writing-markup-with-jsx) element child, a `schema` prop containing a JSON example is auto-injected. |
| `agent` | `AgentLike \| AgentLike[]` | `undefined` | [AI SDK](https://ai-sdk.dev) agent or ordered array `[primary, fallback1, ...]`. Agents are tried in order on retries. |
| `fallbackAgent` | `AgentLike` | `undefined` | Single retry fallback agent. Appended to the `agent` chain. |
| `dependsOn` | `string[]` | `undefined` | Explicit dependency on other task IDs. Task waits until all complete. |
| `needs` | `Record<string, string>` | `undefined` | Named dependencies. Keys become context keys, values are task IDs. |
| `deps` | `Record<string, OutputTarget>` | `undefined` | Typed render-time dependencies. Each key resolves from the task with the same id, or from a matching `needs` entry. |
| `allowTools` | `string[]` | `undefined` | CLI-agent tool allowlist override. Supported by `ClaudeCodeAgent`, `PiAgent`, and `GeminiAgent`. |
| `key` | `string` | `undefined` | Standard React key. No effect on execution. |
| `skipIf` | `boolean` | `false` | Skip this task. |
| `needsApproval` | `boolean` | `false` | Pause and wait for approval before executing. |
| `async` | `boolean` | `false` | Only applies with `needsApproval`. When `true`, unrelated downstream flow can continue while approval is pending. |
| `timeoutMs` | `number` | `undefined` | Max execution time in ms. Task fails on timeout. |
| `retries` | `number` | `Infinity` | Retry attempts on failure. Default: infinite with exponential backoff. Set to `0` to disable. |
| `noRetry` | `boolean` | `false` | Disable retries entirely. Equivalent to `retries={0}`. |
| `retryPolicy` | `RetryPolicy` | `{ backoff: "exponential", initialDelayMs: 1000 }` | `{ backoff?: "fixed" \| "linear" \| "exponential", initialDelayMs?: number }`. Delay capped at 5 minutes. |
| `continueOnFail` | `boolean` | `false` | Workflow continues even if this task fails. |
| `cache` | `CachePolicy` | `undefined` | `{ by?: (ctx) => unknown, version?: string }`. Skip re-execution when a cached result with matching key/version exists. |
| `label` | `string` | `undefined` | Human-readable label for UI and metadata. |
| `meta` | `Record<string, unknown>` | `undefined` | Arbitrary metadata on the task descriptor. |
| `scorers` | `ScorersMap` | `undefined` | Map of scorer configs to evaluate task output after execution. See [Evals & Scorers](/concepts/evals). |
| `memory` | `TaskMemoryConfig` | `undefined` | Per-task memory integration. `{ recall?: { namespace?, query?, topK? }, remember?: { namespace?, key? }, threadId? }`. `recall` injects relevant memory fragments into the prompt before execution; `remember` persists the task output back to memory after success. |
| `heartbeatTimeoutMs` | `number` | `undefined` | Heartbeat monitoring timeout in ms. If the executing task does not emit a heartbeat within this window the task is considered stale and fails. Useful for long-running agent tasks to detect hangs. |
| `children` | `string \| Row \| (() => Row \| Promise<Row>) \| ReactNode \| ((deps) => Row \| ReactNode)` | **(required)** | Agent mode: prompt text or JSX rendered to markdown. Compute mode: callback. Static mode: output value. With `deps`, a render function receiving typed upstream outputs. |
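To make `retryPolicy` concrete, the sketch below computes a delay per attempt for the three `backoff` modes and applies the 5-minute cap. The growth formulas are assumptions (fixed: constant; linear: `initialDelayMs` times the attempt number; exponential: doubling each attempt). Only the option names, the defaults, and the cap come from the table above, and `retryDelayMs` is a hypothetical helper.

```ts
type RetryPolicy = {
  backoff?: "fixed" | "linear" | "exponential";
  initialDelayMs?: number;
};

const MAX_DELAY_MS = 5 * 60 * 1000; // delay cap stated in the props table

// attempt is 1-based: attempt 1 is the first retry.
function retryDelayMs(attempt: number, policy: RetryPolicy = {}): number {
  const backoff = policy.backoff ?? "exponential";
  const initial = policy.initialDelayMs ?? 1000;
  const raw =
    backoff === "fixed"
      ? initial
      : backoff === "linear"
        ? initial * attempt
        : initial * Math.pow(2, attempt - 1); // assumed doubling
  return Math.min(raw, MAX_DELAY_MS);
}

// Default policy: delays double until they hit the cap.
console.log([1, 2, 3, 4, 20].map((attempt) => retryDelayMs(attempt)));
```

Because `retries` defaults to `Infinity`, the cap matters in practice: under these assumptions a persistently failing task settles into one attempt every five minutes rather than backing off forever.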

## Agent mode

```tsx
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import AnalyzePrompt from "./prompts/analyze.mdx";
import ReviewPrompt from "./prompts/review.mdx";

const codeAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a senior software engineer.",
});

<Task id="analyze" output={outputs.analyze} agent={codeAgent}>
  <AnalyzePrompt repoPath={ctx.input.repoPath} />
</Task>

<Task id="review" output={outputs.review} agent={reviewAgent} deps={{ analyze: outputs.analyze }}>
  {(deps) => <ReviewPrompt code={deps.analyze.code} />}
</Task>
```

### Typed deps

```tsx
<Task id="parse" output={outputs.parsed} agent={parser}>
  <ParsePrompt document={ctx.input.document} />
</Task>

<Task id="summarize" output={outputs.summary} agent={writer} deps={{ parse: outputs.parsed }}>
  {(deps) => <SummaryPrompt extracted={deps.parse.fields} />}
</Task>
```

### CLI tool allowlists

`allowTools` narrows the tool surface for supported CLI agents on a per-task basis.

```tsx
<Task
  id="review"
  output={outputs.review}
  agent={claude}
  allowTools={["read", "grep"]}
>
  Review the patch and summarize risks.
</Task>
```

- `allowTools={[]}` disables CLI tools entirely for supported agents.
- When the workflow run is started with `cliAgentToolsDefault: "explicit-only"`, omitted `allowTools` behaves like `[]` for supported CLI agents.
- Task-level `allowTools` always wins over the run-level default.
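The precedence rules above reduce to a small decision. A sketch, assuming `undefined` stands for "no task-level override" and that any run-level default other than `"explicit-only"` leaves the agent's own tool configuration untouched. `effectiveAllowTools` and the `"agent-default"` label are hypothetical.

```ts
// "agent-default" is a made-up label for "no run-level restriction".
type CliToolsDefault = "explicit-only" | "agent-default";

// Returns the allowlist to apply, or undefined to leave the agent's tools as configured.
function effectiveAllowTools(
  taskAllowTools: string[] | undefined,
  runDefault: CliToolsDefault,
): string[] | undefined {
  if (taskAllowTools !== undefined) return taskAllowTools; // task-level always wins
  if (runDefault === "explicit-only") return []; // omitted behaves like []
  return undefined;
}

console.log(effectiveAllowTools(["read", "grep"], "explicit-only"));
console.log(effectiveAllowTools(undefined, "explicit-only"));
console.log(effectiveAllowTools(undefined, "agent-default"));
```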

### Typed deps with `needs`

When the upstream task id differs from the dep key, pair `deps` with `needs` so the key resolves to the right task:

```tsx
<Task
  id="typed-calls"
  output={outputs.typedCalls}
  agent={builder}
  deps={{ contract: outputs.contractSource }}
  needs={{ contract: "parse-contract" }}
>
  {(deps) => <TypedCallsPrompt contract={deps.contract} />}
</Task>
```
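The lookup rule can be sketched as follows: a `deps` key resolves to the task id named in `needs` when there is a matching entry, and otherwise to the task whose id equals the key. `resolveDepTaskIds` is a hypothetical helper, and giving `needs` precedence over a same-named task is an assumption.

```ts
// Hypothetical helper: map each deps key to the upstream task id it reads from.
function resolveDepTaskIds(
  depKeys: string[],
  needs: Record<string, string> = {},
): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const key of depKeys) {
    const mapped = Object.prototype.hasOwnProperty.call(needs, key) ? needs[key] : key;
    resolved[key] = mapped;
  }
  return resolved;
}

// deps={{ contract: ... }} needs={{ contract: "parse-contract" }}
console.log(resolveDepTaskIds(["contract"], { contract: "parse-contract" }));
// deps={{ parse: ... }} with no needs entry
console.log(resolveDepTaskIds(["parse"]));
```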

### [Structured output](/guides/structured-output) with outputSchema

When `outputSchema` is provided and children are a React element, a `schema` prop containing a JSON example is auto-injected. The MDX template can reference `{props.schema}`.

```tsx
import { z } from "zod";

const analysisSchema = z.object({
  summary: z.string(),
  risk: z.enum(["low", "medium", "high"]),
  files: z.array(z.string()),
});

// When `output` is a Zod schema, outputSchema is inferred automatically.
<Task id="analyze" output={outputs.analysis} agent={codeAgent}>
  <AnalysisPrompt repo={ctx.input.repoPath} />
</Task>

// You can still pass outputSchema explicitly to override:
<Task
  id="analyze"
  output={outputs.analysis}
  agent={codeAgent}
  outputSchema={analysisSchema}
>
  <AnalysisPrompt repo={ctx.input.repoPath} />
</Task>
```

### Heartbeat monitoring

Use `heartbeatTimeoutMs` to detect stalled long-running agent tasks. The agent must emit heartbeats periodically; if none arrive within the timeout window, the task fails:

```tsx
<Task
  id="long-migration"
  output={outputs.migration}
  agent={migrationAgent}
  heartbeatTimeoutMs={60_000}
  timeoutMs={3_600_000}
>
  Run the database migration. Report progress periodically.
</Task>
```

This is distinct from `timeoutMs`: `timeoutMs` caps total execution time, while `heartbeatTimeoutMs` detects hangs mid-execution.

## Compute mode

Children is a function, no `agent`. Called at execution time; return value becomes output. Sync or async.

```tsx
// Sync callback
<Task id="calculate" output={outputs.results}>
  {() => ({ total: items.length, status: "complete" })}
</Task>

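// Async callback: run shell commands with Bun Shell (`import { $ } from "bun"`).
// `.nothrow()` stops a non-zero exit from throwing, so exitCode can be inspected.
<Task id="validate" output={outputs.validate} timeoutMs={30000} retries={1}>
  {async () => {
    const testResult = await $`bun test`.quiet().nothrow();
    const typeResult = await $`tsc --noEmit`.quiet().nothrow();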
    return {
      testsPass: testResult.exitCode === 0,
      typesPass: typeResult.exitCode === 0,
    };
  }}
</Task>
```

If the callback throws, the task fails and follows normal retry/`continueOnFail` behavior.

## Static mode

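No `agent`, and children is a plain value rather than a zero-argument compute callback. The value is written directly as output. `deps` also works in static mode: pass a render function that receives the typed upstream outputs and returns the payload.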

```tsx
// Object payload
<Task id="config" output={outputs.config}>
  {{ environment: "production", debug: false }}
</Task>

// Computed payload from upstream output
<Task id="summary" output={outputs.summary} deps={{ results: outputs.results }}>
  {(deps) => ({
    total: deps.results.count,
    status: "complete",
  })}
</Task>
```

## Output resolution

The `output` prop accepts three forms:

**Zod schema from `outputs` (recommended)** -- type-checked at compile time; resolved to the correct table via `zodToKeyName`.

```tsx
const { outputs } = createSmithers({ results: z.object({ done: z.boolean() }) });

<Task id="step" output={outputs.results}>
  {{ done: true }}
</Task>
```

**Drizzle table object** -- runtime calls `getTableName()` to determine storage.

```tsx
<Task id="step" output={myDrizzleTable}>
  {{ done: true }}
</Task>
```

**String key (escape hatch)** -- not type-checked. Resolved at execution time.

```tsx
<Task id="step" output="results">
  {{ done: true }}
</Task>
```

## Full example

### Schema-driven (recommended)

```tsx
import { createSmithers } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  analysis: z.object({ summary: z.string() }),
  setup: z.object({ files: z.array(z.string()) }),
});

const codeAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a senior software engineer.",
});

export default smithers((ctx) => (
  <Workflow name="tasks-demo">
    <Task id="analyze" output={outputs.analysis} agent={codeAgent}>
      {`Analyze: ${ctx.input.description}`}
    </Task>
    <Task id="setup" output={outputs.setup}>
      {{ files: ["README.md", "package.json"] }}
    </Task>
  </Workflow>
));
```

### Custom Drizzle table output (advanced)

```tsx
import { z } from "zod";
import { sqliteTable, text, primaryKey } from "drizzle-orm/sqlite-core";

const auditTable = sqliteTable(
  "audit",
  {
    runId: text("run_id").notNull(),
    nodeId: text("node_id").notNull(),
    status: text("status").notNull(),
    details: text("details").notNull(),
  },
  (t) => ({
    pk: primaryKey({ columns: [t.runId, t.nodeId] }),
  }),
);

const auditSchema = z.object({
  status: z.enum(["ok", "needs-follow-up"]),
  details: z.string(),
});

<Task id="audit" output={auditTable} outputSchema={auditSchema} agent={reviewAgent}>
  Summarize the review outcome for the audit log.
</Task>
```

Custom Drizzle tables must be created and migrated separately. Define the workflow with `createSmithers(...)` as usual.

## [Error handling](/guides/error-handling)

| Scenario | Behavior |
| --- | --- |
| Duplicate `id` | Throws `"Duplicate Task id detected: <id>"` at render time. |
| Missing `output` | Throws `"Task <id> is missing output table."` at render time. |
| Agent timeout | Fails after `timeoutMs`. Retries if `retries > 0`. |
| Agent failure | Fails. Retries if `retries > 0`. Continues if `continueOnFail`. |
| Callback throws | Same retry/`continueOnFail` behavior as agent failures. |
| Callback timeout | Fails after `timeoutMs`. Retries if `retries > 0`. |

## Agent JSON Output Extraction

When `outputSchema` is set, the engine extracts structured JSON from the agent's text response using a multi-strategy pipeline:

1. **Code fence extraction** -- looks for ` ```json ` fenced blocks and parses the content.
2. **Balanced brace extraction** -- finds the outermost `{...}` using brace-depth counting, handling nested objects correctly.
3. **Last balanced JSON** -- if multiple JSON objects appear, the last complete one is used (agents often produce the final answer last).

After extraction, the JSON is validated against `outputSchema`. If validation fails:

- The engine sends a **retry prompt** back to the agent describing the schema violation and asking for corrected output.
- This schema-validation retry happens within the same attempt (it does not consume a `retries` count).
- If the agent still does not produce valid JSON after the schema-validation retry prompts, the attempt fails.
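
The extraction order described above can be sketched in plain TypeScript. This is a minimal illustration, not the engine's implementation: the helper names (`extractFromCodeFence`, `extractLastBalancedObject`, `extractJson`) are hypothetical, and choosing the first parseable fenced block is an assumption. The scanner ignores braces that appear inside JSON string literals, and schema validation is left out.

```typescript
// Illustrative sketch only: these helpers are hypothetical, not Smithers internals.
const FENCE = "`".repeat(3); // built at runtime so the example nests safely in Markdown

// Strategy 1: return the first json-tagged fenced block that parses.
function extractFromCodeFence(text: string): unknown {
  const pattern = new RegExp(FENCE + "json\\s*([\\s\\S]*?)" + FENCE, "g");
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    try {
      return JSON.parse(match[1]);
    } catch (_err) {
      // Not valid JSON; try the next fenced block.
    }
  }
  return undefined;
}

// Strategies 2 and 3: find top-level balanced {...} spans by depth counting
// and keep the last span that parses as JSON.
function extractLastBalancedObject(text: string): unknown {
  let depth = 0;
  let start = -1;
  let inString = false;
  let escaped = false;
  let result: unknown = undefined;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      // Inside a string literal: braces do not count toward depth.
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"' && depth > 0) {
      inString = true;
    } else if (ch === "{") {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === "}" && depth > 0) {
      depth--;
      if (depth === 0) {
        try {
          result = JSON.parse(text.slice(start, i + 1));
        } catch (_err) {
          // Balanced but not JSON (for example a code snippet); skip it.
        }
      }
    }
  }
  return result;
}

// Try the fenced block first, then fall back to brace scanning.
function extractJson(text: string): unknown {
  const fenced = extractFromCodeFence(text);
  return fenced !== undefined ? fenced : extractLastBalancedObject(text);
}
```

`JSON.parse` never returns `undefined`, so an `undefined` result from `extractJson` means no strategy found a parseable object.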

### Auth Failure Circuit Breaker

If an agent returns an authentication error (e.g., invalid API key, expired token), the engine short-circuits without retrying. Auth failures are terminal -- retrying with the same credentials will not produce a different result.
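
A retry loop that treats authentication errors as terminal might look like the sketch below. It is a simplified synchronous model under stated assumptions: `runWithRetries`, `isAuthError`, and the `"AUTH_FAILED"` error code are hypothetical names, and `retries` is assumed to count additional attempts after the first one.

```typescript
// Hypothetical model of the retry loop; not the engine's actual code.
type AttemptResult<T> =
  | { status: "ok"; value: T; attempts: number }
  | { status: "failed"; error: unknown; attempts: number };

// Assumption: auth failures are tagged with a `code` field. The real error shape may differ.
function isAuthError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "AUTH_FAILED"
  );
}

function runWithRetries<T>(attempt: () => T, retries: number): AttemptResult<T> {
  let lastError: unknown = undefined;
  // Assumption: `retries` counts extra attempts, so total attempts = retries + 1.
  for (let n = 1; n <= retries + 1; n++) {
    try {
      return { status: "ok", value: attempt(), attempts: n };
    } catch (err) {
      lastError = err;
      if (isAuthError(err)) {
        // Circuit breaker: the same credentials will fail again, so stop immediately.
        return { status: "failed", error: err, attempts: n };
      }
    }
  }
  return { status: "failed", error: lastError, attempts: retries + 1 };
}
```

With `retries` set to 5, an attempt that throws an auth-tagged error stops after the first attempt, while a transient error uses the remaining attempts.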

### Non-Idempotent Tool Retry Warnings

When a task is retried (via `retries` or manual `retryTask()`), the engine checks whether non-idempotent tools (tools with `sideEffect: true` and `idempotent: false`) were called in prior attempts. If so, a warning message is prepended to the agent's prompt on the next attempt, alerting it that certain side effects may have already occurred.
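
As a rough sketch of that check, the function below filters tool metadata by the two flags named above and prepends a warning when any flagged tool was called earlier. The `ToolInfo` type, the `withRetryWarning` helper, and the warning wording are all hypothetical; the engine's actual message is not shown in this document.

```typescript
// Hypothetical shapes for illustration; only `sideEffect` and `idempotent` come from the docs.
type ToolInfo = { name: string; sideEffect: boolean; idempotent: boolean };

function withRetryWarning(
  prompt: string,
  tools: ToolInfo[],
  calledInPriorAttempts: string[],
): string {
  const called = new Set(calledInPriorAttempts);
  // Only tools that have side effects AND are not idempotent are risky to repeat.
  const risky = tools
    .filter((t) => t.sideEffect && !t.idempotent && called.has(t.name))
    .map((t) => t.name);
  if (risky.length === 0) return prompt;
  const warning =
    `Warning: a previous attempt already called non-idempotent tools (${risky.join(", ")}). ` +
    "Their side effects may have already occurred.";
  return `${warning}\n\n${prompt}`;
}
```

A tool that is idempotent, or that was never called in an earlier attempt, leaves the prompt unchanged.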

## Notes

- Custom Drizzle tables must include `runId` and `nodeId` columns. Tasks inside [`<Loop>`](/components/loop) additionally need `iteration`. `createSmithers(...)` adds these automatically.
- `id` is the `nodeId` in the task descriptor; uniqueness is enforced at render time.
- In agent mode, JSX/MDX children are rendered to markdown (not HTML) before being sent to the agent.

---

## <Sequence>

> Execute child tasks one after another in the order they appear.
> Source: https://smithers.sh/components/sequence

```tsx
import { Sequence } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `skipIf` | `boolean` | `false` | Skip the entire sequence. Returns `null`; no children are mounted. |
| `children` | `ReactNode` | `undefined` | [`<Task>`](/components/task) and [control-flow components](/concepts/control-flow) to execute sequentially. |

## Usage

```tsx
<Workflow name="pipeline">
  <Sequence>
    <Task id="fetch" output={outputs.fetch}>
      {{ url: "https://api.example.com/data" }}
    </Task>
    <Task id="transform" output={outputs.transform} agent={transformer}>
      {`Transform the data: ${ctx.output(outputs.fetch, { nodeId: "fetch" }).url}`}
    </Task>
    <Task id="store" output={outputs.store}>
      {{ stored: true }}
    </Task>
  </Sequence>
</Workflow>
```

## When explicit Sequence is needed

[`<Workflow>`](/components/workflow) sequences direct children implicitly. An explicit `<Sequence>` is needed inside other [control-flow components](/concepts/control-flow):

```tsx
<Workflow name="build-and-deploy">
  <Parallel maxConcurrency={2}>
    {/* Each branch runs its own steps in order */}
    <Sequence>
      <Task id="build-frontend" output={outputs.buildFrontend}>
        {{ status: "built" }}
      </Task>
      <Task id="test-frontend" output={outputs.testFrontend}>
        {{ passed: true }}
      </Task>
    </Sequence>
    <Sequence>
      <Task id="build-backend" output={outputs.buildBackend}>
        {{ status: "built" }}
      </Task>
      <Task id="test-backend" output={outputs.testBackend}>
        {{ passed: true }}
      </Task>
    </Sequence>
  </Parallel>
</Workflow>
```

The two `<Sequence>` groups run in parallel; tasks within each group run sequentially.

## Conditional skipping

```tsx
<Sequence skipIf={ctx.input.skipTests}>
  <Task id="unit-tests" output={outputs.unitTests}>
    {{ passed: true }}
  </Task>
  <Task id="integration-tests" output={outputs.integrationTests}>
    {{ passed: true }}
  </Task>
</Sequence>
```

When `skipIf` is `true`, the component returns `null`. No children are mounted into the execution plan.

## Rendering

`<Sequence>` renders as a `<smithers:sequence>` host element (or `null` when skipped). The runtime assigns ordinals to tasks in source order.

## Notes

- Child ordering matches JSX source order.
- Nestable inside [`<Parallel>`](/components/parallel), [`<Branch>`](/components/branch), [`<Loop>`](/components/loop), or another `<Sequence>`.
- An empty `<Sequence>` is valid and produces no tasks.

---

## <Parallel>

> Execute child tasks concurrently with optional concurrency limits.
> Source: https://smithers.sh/components/parallel

```tsx
import { Parallel } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `undefined` | Optional stable id for tracking and deduping. |
| `maxConcurrency` | `number` | `Infinity` | Max simultaneous children. Remaining tasks queue until a slot opens. |
| `skipIf` | `boolean` | `false` | Skip the entire group. Returns `null`; no children are mounted. |
| `children` | `ReactNode` | `undefined` | [`<Task>`](/components/task) and [control-flow components](/concepts/control-flow) to execute concurrently. |

## Basic usage

```tsx
<Workflow name="checks">
  <Parallel>
    <Task id="lint" output={outputs.lint}>
      {{ errors: 0 }}
    </Task>
    <Task id="typecheck" output={outputs.typecheck}>
      {{ passed: true }}
    </Task>
    <Task id="test" output={outputs.test}>
      {{ passed: true }}
    </Task>
  </Parallel>
</Workflow>
```

## Limiting concurrency

```tsx
<Parallel maxConcurrency={2}>
  <Task id="analyze-repo-1" output={outputs.analyzeRepo1} agent={analyst}>
    Analyze repository alpha.
  </Task>
  <Task id="analyze-repo-2" output={outputs.analyzeRepo2} agent={analyst}>
    Analyze repository beta.
  </Task>
  <Task id="analyze-repo-3" output={outputs.analyzeRepo3} agent={analyst}>
    Analyze repository gamma.
  </Task>
  <Task id="analyze-repo-4" output={outputs.analyzeRepo4} agent={analyst}>
    Analyze repository delta.
  </Task>
</Parallel>
```

At most two agent calls run simultaneously. As each completes, the next queued task starts.
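
To make the queueing behavior concrete, here is a small deterministic simulation of when each child would start, given hypothetical task durations. It models the documented rule (a queued task starts when a slot opens) and is not the engine's scheduler; `simulateStartTimes` and the duration values are assumptions for illustration.

```typescript
// Simulates start times for children of a group with a concurrency cap.
// Children are admitted in source order; each takes the slot that frees up first.
function simulateStartTimes(
  durations: number[],
  maxConcurrency: number = Infinity,
): number[] {
  if (!(maxConcurrency >= 1)) {
    throw new Error("maxConcurrency must be at least 1");
  }
  const slotCount = Math.min(Math.floor(maxConcurrency), durations.length);
  const slotFreeAt: number[] = new Array(slotCount).fill(0);
  const starts: number[] = [];

  for (const duration of durations) {
    // Find the slot that becomes free earliest.
    let slot = 0;
    for (let i = 1; i < slotFreeAt.length; i++) {
      if (slotFreeAt[i] < slotFreeAt[slot]) slot = i;
    }
    const start = slotFreeAt[slot];
    starts.push(start);
    slotFreeAt[slot] = start + duration;
  }
  return starts;
}

// Four tasks with durations 3, 1, 2, 1 and a cap of 2:
// the third task waits for the second to finish, the fourth for the next free slot.
const capped = simulateStartTimes([3, 1, 2, 1], 2); // [0, 0, 1, 3]
const uncapped = simulateStartTimes([3, 1, 2, 1]); // [0, 0, 0, 0]
```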

## Combining with [Sequence](/components/sequence)

```tsx
<Workflow name="ci">
  <Parallel maxConcurrency={3}>
    <Sequence>
      <Task id="build-web" output={outputs.buildWeb}>{{ ok: true }}</Task>
      <Task id="deploy-web" output={outputs.deployWeb}>{{ ok: true }}</Task>
    </Sequence>
    <Sequence>
      <Task id="build-api" output={outputs.buildApi}>{{ ok: true }}</Task>
      <Task id="deploy-api" output={outputs.deployApi}>{{ ok: true }}</Task>
    </Sequence>
  </Parallel>
</Workflow>
```

The two [`<Sequence>`](/components/sequence) groups run in parallel. Within each, tasks run sequentially.

## Conditional skipping

```tsx
<Parallel skipIf={!ctx.input.runChecks}>
  <Task id="lint" output={outputs.lint}>{{ errors: 0 }}</Task>
  <Task id="test" output={outputs.test}>{{ passed: true }}</Task>
</Parallel>
```

## Rendering

`<Parallel>` renders as a `<smithers:parallel>` host element (or `null` when skipped). Each child receives `parallelGroupId` and `parallelMaxConcurrency` in its task descriptor.

## Notes

- Omitting `maxConcurrency` (or setting `Infinity`) starts all children simultaneously.
- The group completes when all children finish (or fail, if `continueOnFail` is set on individual tasks).
- Nestable inside [`<Sequence>`](/components/sequence), [`<Branch>`](/components/branch), [`<Loop>`](/components/loop), or another `<Parallel>`.
- An empty `<Parallel>` is valid and completes immediately.

---

## <Branch>

> Conditional branching that executes one of two paths based on a boolean condition.
> Source: https://smithers.sh/components/branch

```tsx
import { Branch } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `if` | `boolean` | **(required)** | Condition. `true` executes `then`; `false` executes `else`. |
| `then` | `ReactElement` | **(required)** | Element to render when `true`. |
| `else` | `ReactElement` | `undefined` | Element to render when `false`. If omitted, nothing executes. |
| `skipIf` | `boolean` | `false` | Skip the entire branch regardless of condition. Returns `null`. |

## Basic usage

```tsx
<Workflow name="deploy-pipeline">
  <Task id="test" output={outputs.test}>
    {{ passed: true, error: null }}
  </Task>

  <Branch
    if={ctx.output(outputs.test, { nodeId: "test" }).passed}
    then={
      <Task id="deploy" output={outputs.deploy}>
        {{ url: "https://prod.example.com" }}
      </Task>
    }
    else={
      <Task id="notify-failure" output={outputs.notifyFailure}>
        {{ message: "Tests failed, skipping deploy." }}
      </Task>
    }
  />
</Workflow>
```

## Without an else branch

```tsx
<Branch
  if={ctx.input.needsReview}
  then={
    <Task id="review" output={outputs.review} agent={reviewAgent}>
      Review the changes.
    </Task>
  }
/>
```

## Complex sub-graphs

Each branch accepts any workflow element. Wrap multiple elements in [`<Sequence>`](/components/sequence) or [`<Parallel>`](/components/parallel):

```tsx
<Branch
  if={ctx.output(outputs.triage, { nodeId: "triage" }).severity === "critical"}
  then={
    <Sequence>
      <Task id="hotfix" output={outputs.hotfix} agent={codeAgent}>
        Write a hotfix for the critical issue.
      </Task>
      <Task id="emergency-deploy" output={outputs.emergencyDeploy}>
        {{ deployed: true }}
      </Task>
    </Sequence>
  }
  else={
    <Task id="add-to-backlog" output={outputs.backlog}>
      {{ queued: true }}
    </Task>
  }
/>
```

## Conditional skipping

```tsx
<Branch
  skipIf={ctx.input.dryRun}
  if={testsPass}
  then={<Task id="deploy" output={outputs.deploy}>{{ ok: true }}</Task>}
/>
```

## Condition evaluation

The `if` prop is evaluated at render time. Smithers [re-renders the tree each frame](/concepts/reactivity), so conditions can depend on outputs of completed tasks:

```tsx
const check = ctx.outputMaybe(outputs.check, { nodeId: "check" });

return (
  <Workflow name="adaptive">
    <Task id="check" output={outputs.check} agent={checkAgent}>
      Check whether the system is healthy.
    </Task>
    <Branch
      if={check?.healthy === true}
      then={<Task id="proceed" output={outputs.proceed}>{{ ok: true }}</Task>}
      else={<Task id="remediate" output={outputs.remediate} agent={fixAgent}>Fix it.</Task>}
    />
  </Workflow>
);
```

Use `ctx.outputMaybe()` when the upstream task may not have completed yet.

## Rendering

`<Branch>` renders the selected child wrapped in a `<smithers:branch>` host element. Only the selected branch's tasks are mounted. The other branch is absent from the task graph.

## Notes

- Only one branch executes per render frame.
- `then` and `else` each accept a single `ReactElement`. Wrap multiple elements in [`<Sequence>`](/components/sequence) or [`<Parallel>`](/components/parallel).
- Conditions are re-evaluated each render frame, enabling data-dependent [control flow](/concepts/control-flow).

---

## <Loop>

> Iterative loop that re-executes its children until a condition is met or the maximum iteration count is reached.
> Source: https://smithers.sh/components/loop

```tsx
import { Loop } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | auto-generated | Loop identifier. Auto-generated from tree position if omitted. |
| `until` | `boolean` | **(required)** | Stop condition. Re-evaluated each iteration. Loop exits when `true`. |
| `maxIterations` | `number` | `5` | Maximum iterations. Loop stops regardless of `until`. |
| `onMaxReached` | `"fail" \| "return-last"` | `"return-last"` | Behavior at limit. `"fail"`: workflow fails. `"return-last"`: keep final output and continue. |
| `continueAsNewEvery` | `number` | `undefined` | Number of iterations after which the loop triggers a [continue-as-new](/components/continue-as-new) to prevent unbounded workflow history growth. The workflow state is checkpointed and execution resumes in a fresh run with a clean history. |
| `skipIf` | `boolean` | `false` | Skip the loop entirely. Returns `null`. |
| `children` | `ReactNode` | `undefined` | [`<Task>`](/components/task) and [control-flow components](/concepts/control-flow) to execute each iteration. |

## Basic usage

```tsx
<Workflow name="review-loop">
  <Loop
    until={ctx.outputMaybe(outputs.review, { nodeId: "review" })?.approved === true}
    maxIterations={3}
    onMaxReached="return-last"
  >
    <Task id="review" output={outputs.review} agent={reviewAgent}>
      Review the code and decide whether to approve.
    </Task>
  </Loop>
</Workflow>
```

## Iteration state

Each iteration increments an internal counter exposed on the context:

- **`ctx.iteration`** -- current iteration number (0-indexed).
- **`ctx.iterations`** -- map of loop ids to current iteration numbers.

Tasks inside `<Loop>` receive the iteration number in their descriptor. Custom Drizzle tables must include `iteration` in the primary key. `createSmithers(...)` adds this automatically for schema-driven outputs.

```tsx
import { sqliteTable, text, integer, primaryKey } from "drizzle-orm/sqlite-core";

const reviewTable = sqliteTable(
  "review",
  {
    runId: text("run_id").notNull(),
    nodeId: text("node_id").notNull(),
    iteration: integer("iteration").notNull().default(0),
    approved: integer("approved", { mode: "boolean" }).notNull(),
  },
  (t) => ({
    pk: primaryKey({ columns: [t.runId, t.nodeId, t.iteration] }),
  }),
);
```

## Accessing previous iteration output with `ctx.latest()`

`ctx.latest(table, nodeId)` retrieves the most recent output for a task across all iterations.

| Parameter | Type | Description |
| --- | --- | --- |
| `table` | `ZodObject \| Table \| string` | Output target: [Zod](https://zod.dev) schema from `outputs`, Drizzle table, or schema key (not SQLite table name). |
| `nodeId` | `string` | The `id` prop of the target [`<Task>`](/components/task). |

```tsx
const { Workflow, smithers, outputs } = createSmithers({
  draft: z.object({ text: z.string(), score: z.number() }),
  review: z.object({ approved: z.boolean(), feedback: z.string() }),
});

export default smithers((ctx) => {
  const latestDraft = ctx.latest("draft", "write");   // string key, not table name
  const latestReview = ctx.latest("review", "review");

  return (
    <Workflow name="refine-loop">
      <Loop
        until={latestReview?.approved === true}
        maxIterations={5}
      >
        <Sequence>
          <Task id="write" output={outputs.draft} agent={writer}>
            {latestReview
              ? `Improve the draft. Feedback: ${latestReview.feedback}`
              : `Write a first draft about: ${ctx.input.topic}`}
          </Task>
          <Task id="review" output={outputs.review} agent={reviewer}>
            {`Review this draft (score: ${latestDraft?.score ?? "N/A"}):\n${latestDraft?.text ?? ""}`}
          </Task>
        </Sequence>
      </Loop>
    </Workflow>
  );
});
```

The [re-render cycle](/concepts/reactivity) drives iteration: after tasks complete, the tree re-renders with new outputs in context, `until` is re-evaluated, and the next iteration starts if not satisfied.

> **Tip:** `ctx.latest()` returns the highest-iteration result. `ctx.output()` defaults to the current iteration, which may not have output yet at render time.
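
The difference between the two lookups can be modeled with rows keyed by node id and iteration. This is an in-memory sketch, not the Smithers context API: `OutputRow`, `outputAt`, and `latest` are hypothetical stand-ins for the stored outputs and the two accessors.

```typescript
// Hypothetical in-memory model of loop outputs.
type OutputRow<T> = { nodeId: string; iteration: number; value: T };

// Models a lookup pinned to one iteration (like reading the current iteration).
function outputAt<T>(rows: OutputRow<T>[], nodeId: string, iteration: number): T | undefined {
  const row = rows.find((r) => r.nodeId === nodeId && r.iteration === iteration);
  return row?.value;
}

// Models a lookup for the highest-iteration row of a node.
function latest<T>(rows: OutputRow<T>[], nodeId: string): T | undefined {
  let best: OutputRow<T> | undefined = undefined;
  for (const row of rows) {
    if (row.nodeId === nodeId && (best === undefined || row.iteration > best.iteration)) {
      best = row;
    }
  }
  return best?.value;
}

// Two completed review iterations; the render for iteration 2 has no row yet.
const reviewRows: OutputRow<{ approved: boolean; feedback: string }>[] = [
  { nodeId: "review", iteration: 0, value: { approved: false, feedback: "Too long" } },
  { nodeId: "review", iteration: 1, value: { approved: false, feedback: "Add examples" } },
];
const current = outputAt(reviewRows, "review", 2); // undefined: nothing written yet
const mostRecent = latest(reviewRows, "review"); // the iteration 1 row
```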


## Accessing iteration count

```tsx
<Loop
  until={ctx.iterationCount("review", "review") >= 2}
  maxIterations={5}
>
  <Task id="review" output={outputs.review} agent={reviewAgent}>
    {`This is iteration ${ctx.iteration}. Review the code.`}
  </Task>
</Loop>
```

## Multiple loops

Use `id` to distinguish multiple loops in the same workflow:

```tsx
<Workflow name="multi-loop">
  <Loop id="code-loop" until={codeApproved} maxIterations={3}>
    <Task id="write-code" output={outputs.writeCode} agent={codeAgent}>
      Write the implementation.
    </Task>
  </Loop>
  <Loop id="review-loop" until={reviewApproved} maxIterations={3}>
    <Task id="review-code" output={outputs.reviewCode} agent={reviewAgent}>
      Review the implementation.
    </Task>
  </Loop>
</Workflow>
```

When `id` is omitted, a stable id is generated from tree position.

## onMaxReached behavior

| Value | Behavior |
| --- | --- |
| `"return-last"` | Keep final iteration output; workflow continues. Default. |
| `"fail"` | Workflow fails with max-iteration error. |

```tsx
// Fail the workflow if we can't converge in 10 iterations
<Loop until={converged} maxIterations={10} onMaxReached="fail">
  <Task id="optimize" output={outputs.optimize} agent={optimizer}>
    Optimize the solution.
  </Task>
</Loop>
```
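
How `until`, `maxIterations`, and `onMaxReached` interact can be modeled with a plain loop. The sketch below is a simplified stand-in for the render cycle, not the runtime itself: `runLoop` is a hypothetical helper, and the error message is illustrative. The defaults (`5` and `"return-last"`) mirror the props table.

```typescript
type OnMaxReached = "fail" | "return-last";

// Hypothetical model: `until` is checked before each iteration, as it would be at render time.
function runLoop<T>(
  body: (iteration: number) => T,
  until: (last: T | undefined) => boolean,
  maxIterations: number = 5,
  onMaxReached: OnMaxReached = "return-last",
): { iterations: number; last: T | undefined } {
  let last: T | undefined = undefined;
  let iteration = 0; // 0-indexed, like ctx.iteration
  while (!until(last)) {
    if (iteration >= maxIterations) {
      if (onMaxReached === "fail") {
        throw new Error(`Loop reached maxIterations (${maxIterations})`);
      }
      break; // "return-last": keep the final output and continue the workflow
    }
    last = body(iteration);
    iteration++;
  }
  return { iterations: iteration, last };
}

// Approves on the third pass, so the loop stops before the limit of 5.
const converged = runLoop(
  (i) => ({ approved: i === 2 }),
  (last) => last?.approved === true,
);
```

Because `until` is checked first, a loop whose condition becomes `true` on the final allowed iteration exits normally rather than triggering `onMaxReached`.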

## Full example

```tsx
import { createSmithers, Loop } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, Task, smithers, outputs } = createSmithers({
  review: z.object({
    approved: z.boolean(),
    feedback: z.string().nullable(),
  }),
});

const reviewAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a thorough code reviewer.",
});

export default smithers((ctx) => (
  <Workflow name="iterative-review">
    <Loop
      until={
        ctx.outputMaybe(outputs.review, { nodeId: "review" })?.approved === true
      }
      maxIterations={5}
      onMaxReached="return-last"
    >
      <Task id="review" output={outputs.review} agent={reviewAgent}>
        {`Review this code and either approve or provide feedback:\n\n${ctx.input.code}`}
      </Task>
    </Loop>
  </Workflow>
));
```

## Ralph alias

`Loop` is also exported as `Ralph`, the component's original name. The `Ralph` export is deprecated; use `Loop` in new code:

```tsx
import { Loop } from "smithers-orchestrator";     // preferred
import { Ralph } from "smithers-orchestrator";     // deprecated alias, same component
```

Both names render the same `<smithers:ralph>` host element. Many composite components (Supervisor, [ReviewLoop](/guides/review-loop), Optimizer, Debate) use `Loop` internally for their iteration logic.

## Infinite loop pattern

To create an intentionally infinite loop (for polling, monitoring, or long-running daemons), set `until={false}` with no `maxIterations`, and use `continueAsNewEvery` to prevent unbounded history growth:

```tsx
<Loop
  id="monitor"
  until={false}
  continueAsNewEvery={100}
>
  <Task id="check" output={outputs.status} agent={monitorAgent}>
    Check system health and report any anomalies.
  </Task>
</Loop>
```

The `continueAsNewEvery` prop checkpoints the workflow state every N iterations and resumes in a fresh execution, keeping the event history bounded.

## Rendering

`<Loop>` renders as a `<smithers:ralph>` host element (or `null` when skipped). The runtime manages iteration state and re-renders the tree each iteration.

## Nested loops

Direct nesting -- `<Loop>` as immediate child of `<Loop>` -- throws at render time. Wrap the inner loop in [`<Sequence>`](/components/sequence):

```tsx
<Workflow name="nested-loops">
  <Loop id="outer" until={outerDone} maxIterations={5}>
    <Sequence>
      <Loop id="inner" until={innerDone} maxIterations={3}>
        <Task id="innerTask" output="innerOutput" agent={agent}>
          Run the inner loop body.
        </Task>
      </Loop>
    </Sequence>
  </Loop>
</Workflow>
```

## Restrictions

- **Direct nesting throws.** Wrap the inner `<Loop>` in [`<Sequence>`](/components/sequence).
- **Duplicate ids throw.** Two loops cannot share the same `id`.

## Notes

- `until` is evaluated at render time on every frame and typically references loop-body output.
- Read that output with `ctx.outputMaybe()`, since it does not exist on the first render.
- Custom Drizzle tables for tasks inside `<Loop>` must include `iteration` in the primary key. `createSmithers(...)` handles this automatically.
- The iteration counter resets to 0 at the start of each workflow run.

---

## <Approval>

> A first-class JSX approval node that pauses durably and resolves to an approval decision, selection, or ranking value.
> Source: https://smithers.sh/components/approval

Pauses the [workflow](/components/workflow) until a human approves or denies. `mode="approve"` writes an `ApprovalDecision` to the configured output:

```ts
type ApprovalDecision = {
  approved: boolean;
  note: string | null;
  decidedBy: string | null;
  decidedAt: string | null;
};
```

`decidedAt` is reserved for compatibility, but Smithers keeps the actual approval timestamp in internal approval records and the event log instead of the durable task output.

## Import

```tsx
import { Approval, approvalDecisionSchema } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | **(required)** | Unique node id within the workflow. |
| `mode` | `"approve" \| "select" \| "rank"` | `"approve"` | Approval shape. `"approve"` returns a boolean decision, `"select"` returns one option, `"rank"` returns an ordered list. |
| `options` | `ApprovalOption[]` | `undefined` | Required for `mode="select"` and `mode="rank"`. |
| `output` | `z.ZodObject \| Table \| string` | **(required)** | Where to persist the decision. Zod schema from `outputs` (recommended), Drizzle table, or string key. |
| `outputSchema` | `z.ZodObject` | `approvalDecisionSchema` | Override the decision schema (manual DB API). |
| `request` | `{ title: string; summary?: string; metadata?: Record<string, unknown> }` | **(required)** | Human-facing request. `title` becomes the node label. |
| `onDeny` | `"fail" \| "continue" \| "skip"` | `"fail"` | Behavior after denial. `"continue"` and `"skip"` still persist the denial. |
| `allowedScopes` | `string[]` | `undefined` | Optional gateway scopes allowed to decide this approval. |
| `allowedUsers` | `string[]` | `undefined` | Optional gateway user IDs allowed to decide this approval. |
| `autoApprove` | `ApprovalAutoApprove` | `undefined` | Auto-approval policy. Supports immediate auto-approval, approval-after-history, and audited auto-approvals. |
| `async` | `boolean` | `false` | When `true`, unrelated downstream flow can continue while this approval is pending. Explicit dependencies still wait for the resolved decision. |
| `dependsOn` | `string[]` | `undefined` | Task IDs that must complete first. |
| `needs` | `Record<string, string>` | `undefined` | Named deps. Keys become context keys, values are task IDs. |
| `skipIf` | `boolean` | `false` | Skip this node entirely. |
| `timeoutMs` | `number` | `undefined` | Max wait in ms. Node fails on timeout. |
| `retries` | `number` | `0` | Retry attempts before failure. |
| `retryPolicy` | `RetryPolicy` | `undefined` | `{ backoff?: "fixed" \| "linear" \| "exponential", initialDelayMs?: number }` |
| `continueOnFail` | `boolean` | `false` | Workflow continues even if this node fails. |
| `cache` | `CachePolicy` | `undefined` | `{ by?: (ctx) => unknown, version?: string }`. Skip re-execution on cache hit. |
| `label` | `string` | `request.title` | Display label override. |
| `meta` | `Record<string, unknown>` | `undefined` | Extra metadata merged with request fields. |

## Schema-driven Example

```tsx
import {
  Approval,
  Sequence,
  Task,
  Workflow,
  approvalDecisionSchema,
  createSmithers,
} from "smithers-orchestrator";
import { z } from "zod";

const { smithers, outputs } = createSmithers({
  publishApproval: approvalDecisionSchema,
  publishResult: z.object({
    status: z.enum(["published", "rejected"]),
  }),
});

export default smithers((ctx) => {
  const decision = ctx.outputMaybe(outputs.publishApproval, {
    nodeId: "approve-publish",
  });

  return (
    <Workflow name="publish-flow">
      <Sequence>
        <Approval
          id="approve-publish"
          output={outputs.publishApproval}
          request={{
            title: "Publish the draft?",
            summary: "Human review is required before production publish.",
            metadata: { channel: "blog" },
          }}
          onDeny="continue"
        />

        {decision ? (
          <Task id="record-decision" output={outputs.publishResult}>
            {{
              status: decision.approved ? "published" : "rejected",
            }}
          </Task>
        ) : null}
      </Sequence>
    </Workflow>
  );
});
```

## Manual API Example

Pass `outputSchema={approvalDecisionSchema}` when `output` is a Drizzle table.

```tsx
<Approval
  id="approve-deploy"
  output={deployApprovalTable}
  outputSchema={approvalDecisionSchema}
  request={{
    title: "Deploy to production?",
    summary: "Build 2026.03.15 passed all checks.",
  }}
/>
```

## Selection and ranking modes

`<Approval>` can also return typed non-boolean outputs.

```tsx
import {
  Approval,
  approvalRankingSchema,
  approvalSelectionSchema,
} from "smithers-orchestrator";

<Approval
  id="pick-plan"
  mode="select"
  output={outputs.selection}
  request={{ title: "Pick a rollout plan" }}
  options={[
    { key: "canary", label: "Canary" },
    { key: "regional", label: "Regional" },
  ]}
/>

<Approval
  id="rank-plans"
  mode="rank"
  output={outputs.ranking}
  request={{ title: "Rank the rollout plans" }}
  options={[
    { key: "canary", label: "Canary" },
    { key: "regional", label: "Regional" },
    { key: "global", label: "Global" },
  ]}
/>
```

- `mode="select"` returns `{ selected: string, notes: string | null }`
- `mode="rank"` returns `{ ranked: string[], notes: string | null }`

## Scoped approvals and auto-approval

```tsx
<Approval
  id="deploy-prod"
  output={outputs.deployApproval}
  request={{ title: "Deploy to production?" }}
  allowedScopes={["approve"]}
  allowedUsers={["user:oncall", "user:release-manager"]}
  autoApprove={{ after: 2, audit: true }}
/>
```

- `allowedScopes` and `allowedUsers` are enforced by [`Gateway`](/integrations/gateway).
- `autoApprove={{ after: N }}` auto-approves after `N` consecutive manual approvals for the same workflow node.
- `audit: true` preserves an approval record and emits `ApprovalAutoApproved`.

The full `ApprovalAutoApprove` type:

```ts
type ApprovalAutoApprove = {
  after?: number;
  condition?: (ctx: WorkflowContext) => boolean;
  audit?: boolean;
  revertOn?: (ctx: WorkflowContext) => boolean;
};
```

| Field | Description |
| --- | --- |
| `after` | Auto-approve after this many consecutive manual approvals for the same node. |
| `condition` | Predicate evaluated at render time. When it returns `true`, the node is auto-approved immediately without waiting for human input. |
| `audit` | When `true`, an approval record is written and `ApprovalAutoApproved` is emitted even for auto-approvals. Defaults to `true`. |
| `revertOn` | Predicate evaluated at render time. When it returns `true`, a previously triggered auto-approval is reverted and the node goes back to waiting for human input. |

`condition` and `revertOn` are re-evaluated each render, so they can react to upstream task output or workflow state.
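
One way to read the four fields together is as a pure decision function evaluated on each render. The sketch below is an interpretation, not the engine's logic: `shouldAutoApprove` and the `consecutiveManualApprovals` counter are hypothetical, and giving `revertOn` precedence over `condition` and `after` is an assumption.

```typescript
// Mirrors the documented ApprovalAutoApprove shape, generic over the context type.
type AutoApprovePolicy<Ctx> = {
  after?: number;
  condition?: (ctx: Ctx) => boolean;
  audit?: boolean;
  revertOn?: (ctx: Ctx) => boolean;
};

// Hypothetical helper: returns true when the node should resolve without human input.
function shouldAutoApprove<Ctx>(
  policy: AutoApprovePolicy<Ctx> | undefined,
  ctx: Ctx,
  consecutiveManualApprovals: number,
): boolean {
  if (policy === undefined) return false;
  // Assumption: a true `revertOn` overrides every other rule.
  if (policy.revertOn?.(ctx) === true) return false;
  // `condition` approves immediately when it holds.
  if (policy.condition?.(ctx) === true) return true;
  // `after` approves once enough consecutive manual approvals have been recorded.
  return policy.after !== undefined && consecutiveManualApprovals >= policy.after;
}
```

For example, with `{ after: 2 }` the function returns `false` after one manual approval and `true` after two.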

## Behavior

- Workflow enters [`waiting-approval`](/concepts/approvals) when this node is reached.
- With `async`, the run can keep traversing unrelated later nodes while this approval is pending.
- `smithers approve` / `smithers deny` updates the record durably.
- On [resume](/concepts/suspend-and-resume), the node resolves to a decision object; downstream JSX branches on the value.
- `onDeny="fail"` -- hard gate.
- `onDeny="continue"` -- branch on `decision.approved`.
- Use `ctx.outputMaybe(...)` when branching on an async approval's output, since the decision may not exist yet during earlier renders.

## Metrics

Async approvals contribute to the Prometheus gauge `smithers_external_wait_async_pending{kind="approval"}` while waiting for human input.

## Durable deferred resolution

`<Approval>` waits on a durable deferred. When the node enters the `waiting-approval` state, a `DurableDeferred` (from `@effect/workflow`) is created and awaited by the executing task fiber. The deferred is keyed to the run, node, and iteration, so it survives process restarts: if the worker crashes while waiting, the next worker that picks up the task re-awaits the same deferred and receives the resolution as soon as a human submits a decision.

When `smithers approve` or `smithers deny` is called, `bridgeApprovalResolve` resolves the deferred, which unblocks the awaiting fiber and lets the compute function proceed to read the decision from the database. No polling is needed.
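
The keyed-deferred idea can be illustrated with a small in-memory model. It does not use `@effect/workflow`, and nothing here is durable: `DeferredStore`, `deferredKey`, and `dropWaiters` are hypothetical names, and the `resolved` map stands in for persistent storage. The point is only that a waiter re-attaching under the same key still receives the decision.

```typescript
type Decision = { approved: boolean; note: string | null };

// Hypothetical key format; the docs only say the deferred is keyed to run, node, and iteration.
function deferredKey(runId: string, nodeId: string, iteration: number): string {
  return `${runId}:${nodeId}:${iteration}`;
}

class DeferredStore {
  // Stand-in for durable storage: survives simulated worker crashes.
  private resolved = new Map<string, Decision>();
  // In-process callbacks: lost when a worker crashes.
  private waiters = new Map<string, Array<(d: Decision) => void>>();

  wait(key: string, onResolved: (d: Decision) => void): void {
    const existing = this.resolved.get(key);
    if (existing !== undefined) {
      onResolved(existing); // already decided: deliver immediately
      return;
    }
    const list = this.waiters.get(key) ?? [];
    list.push(onResolved);
    this.waiters.set(key, list);
  }

  resolve(key: string, decision: Decision): void {
    if (this.resolved.has(key)) return; // first resolution wins
    this.resolved.set(key, decision);
    for (const notify of this.waiters.get(key) ?? []) notify(decision);
    this.waiters.delete(key);
  }

  // Simulates a worker crash: waiting callbacks disappear, stored decisions remain.
  dropWaiters(): void {
    this.waiters.clear();
  }
}
```

A second worker that calls `wait` with the same key after a crash is notified when `resolve` runs, and any later `wait` receives the stored decision immediately, so no polling loop is involved.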

## `<Approval>` vs `needsApproval`

| Use | When |
| --- | --- |
| `<Approval>` | Decision must be persisted as data and consumed by downstream nodes. |
| `needsApproval` on [`<Task>`](/components/task) | Simple pause before a task; no separate decision value needed. |

---

## <MergeQueue>

> Queue tasks so that at most `maxConcurrency` run at once; defaults to 1.
> Source: https://smithers.sh/components/merge-queue

## Import

```tsx
import { MergeQueue } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | auto-generated | Stable id for the queue group. |
| `maxConcurrency` | `number` | `1` | Max simultaneous child tasks. |
| `skipIf` | `boolean` | `false` | Skip the entire subtree. |
| `children` | `ReactNode` | -- | Child tasks/control-flow nodes. |

## Examples

### Single-lane (default)

```tsx
<MergeQueue>
  <Task id="lint" output={outputs.outputC}>{{ value: 1 }}</Task>
  <Task id="build" output={outputs.outputC}>{{ value: 2 }}</Task>
  <Task id="test" output={outputs.outputC}>{{ value: 3 }}</Task>
</MergeQueue>
```

### Custom concurrency

```tsx
<MergeQueue maxConcurrency={2}>
  {items.map((it, i) => (
    <Task key={i} id={`t${i}`} output={outputs.outputC}>{{ value: i }}</Task>
  ))}
</MergeQueue>
```

### Nesting with Parallel

```tsx
<Parallel maxConcurrency={3}>
  <MergeQueue>
    {items.map((it, i) => (
      <Task key={i} id={`q${i}`} output={outputs.outputC}>{{ value: i }}</Task>
    ))}
  </MergeQueue>
  <Task id="other" output={outputs.outputC}>{{ value: 99 }}</Task>
</Parallel>
```

The inner `<MergeQueue>` constrains its children to 1-at-a-time. The outer `<Parallel>` runs unrelated siblings concurrently up to its own limit.

## Internals

Renders as `<smithers:merge-queue>` (or `null` when skipped). Each child task receives `parallelGroupId` and `parallelMaxConcurrency` in its descriptor. The engine enforces the concurrency cap per group.

## Notes

- Defaults to single-lane (`maxConcurrency = 1`).
- Innermost group determines the effective cap for its descendants.
- Tasks outside the queue are unaffected by its limit.
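
The "innermost group wins" rule can be sketched as a tree walk that hands each task the group info of its nearest enclosing group. This is a simplified model: the `PlanNode` type and `effectiveCaps` helper are hypothetical, only `<Parallel>` and `<MergeQueue>` are modeled, and the defaults (`Infinity` and `1`) come from the props tables.

```typescript
// Field names match the descriptor fields mentioned under Internals.
type GroupInfo = { parallelGroupId: string; parallelMaxConcurrency: number };

// Hypothetical, simplified plan tree.
type PlanNode =
  | { kind: "task"; id: string }
  | {
      kind: "parallel" | "merge-queue";
      id: string;
      maxConcurrency?: number;
      children: PlanNode[];
    };

function effectiveCaps(
  node: PlanNode,
  inherited: GroupInfo | undefined = undefined,
  out: Map<string, GroupInfo | undefined> = new Map(),
): Map<string, GroupInfo | undefined> {
  if (node.kind === "task") {
    out.set(node.id, inherited); // tasks take the nearest enclosing group
    return out;
  }
  const fallback = node.kind === "merge-queue" ? 1 : Infinity;
  const group: GroupInfo = {
    parallelGroupId: node.id,
    parallelMaxConcurrency: node.maxConcurrency ?? fallback,
  };
  // Children see this group, which shadows any outer one.
  for (const child of node.children) effectiveCaps(child, group, out);
  return out;
}

// Mirrors the "Nesting with Parallel" example: a queue and a sibling task inside <Parallel>.
const plan: PlanNode = {
  kind: "parallel",
  id: "outer",
  maxConcurrency: 3,
  children: [
    { kind: "merge-queue", id: "queue", children: [{ kind: "task", id: "q0" }, { kind: "task", id: "q1" }] },
    { kind: "task", id: "other" },
  ],
};
const caps = effectiveCaps(plan);
```

Here `q0` and `q1` end up with a cap of 1 from the queue, while `other` keeps the outer cap of 3.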

---

## <Worktree>

> Execute a subtree in a separate JJ worktree rooted at `path`.
> Source: https://smithers.sh/components/worktree

## Import

```tsx
import { Worktree, Task, Parallel, MergeQueue } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | auto-generated | Stable id for tracking/deduping. |
| `path` | `string` | -- | Filesystem path for the [JJ](https://martinvonz.github.io/jj) [worktree](https://git-scm.com/docs/git-worktree) root. Required; non-empty. |
| `branch` | `string` | `undefined` | Branch to check out. Omit to use the current branch. |
| `baseBranch` | `string` | `"main"` | Base branch/revision for sync or creation. |
| `skipIf` | `boolean` | `false` | Skip the subtree. |
| `children` | `ReactNode` | -- | Nested [tasks](/components/task) and [control-flow nodes](/concepts/control-flow). |

## Basics

```tsx
<Worktree path="/tmp/smithers/wt-a">
  <Task id="build" output={outputs.outputC}>{{ value: 1 }}</Task>
  <Task id="test" output={outputs.outputC}>{{ value: 2 }}</Task>
  <MergeQueue>
    <Task id="apply" output={outputs.outputC}>{{ value: 3 }}</Task>
  </MergeQueue>
  <Parallel maxConcurrency={2}>
    <Task id="lint" output={outputs.outputC}>{{ value: 4 }}</Task>
  </Parallel>
  <Task id="package" output={outputs.outputC}>{{ value: 5 }}</Task>
  <Task id="release" output={outputs.outputC}>{{ value: 6 }}</Task>
</Worktree>
```

Descendant [tasks](/components/task) receive `worktreeId` and a normalized absolute `worktreePath` in their descriptors. The engine uses `worktreePath` as `cwd` for JJ operations and tool execution.

## Path Resolution

| Input | Behavior |
| --- | --- |
| Relative path | Resolves against `baseRootDir` (or `process.cwd()`). |
| Absolute path | Preserved and normalized. |
| Empty/whitespace | Rejected: `<Worktree> requires a non-empty path prop`. |
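
The table can be read as the following sketch. It is POSIX-only and dependency-free for illustration; `resolveWorktreePath` and `normalizePosix` are hypothetical helpers, `baseRootDir` is assumed to be an absolute path passed in explicitly, and Smithers' own normalization may differ.

```typescript
// Collapse ".", "..", and repeated slashes in an absolute POSIX path.
function normalizePosix(absolutePath: string): string {
  const segments: string[] = [];
  for (const segment of absolutePath.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") segments.pop();
    else segments.push(segment);
  }
  return "/" + segments.join("/");
}

// Hypothetical helper mirroring the three rows of the table above.
function resolveWorktreePath(path: string, baseRootDir: string): string {
  if (path.trim() === "") {
    throw new Error("<Worktree> requires a non-empty path prop");
  }
  // Absolute paths are preserved; relative paths resolve against baseRootDir.
  const absolute = path.charAt(0) === "/" ? path : baseRootDir + "/" + path;
  return normalizePosix(absolute);
}
```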

## Nesting

The innermost `<Worktree>` in scope determines a task's effective `worktreeId` and `worktreePath`:

```tsx
<Worktree id="outer" path="/tmp/wt-outer">
  <Task id="a" output={outputs.outputC}>{{ value: "outer" }}</Task>
  <Worktree id="inner" path="./nested">
    <Task id="b" output={outputs.outputC}>{{ value: "inner" }}</Task>
  </Worktree>
</Worktree>
```

## With [`<Parallel>`](/components/parallel) and [`<MergeQueue>`](/components/merge-queue)

```tsx
<Worktree path="/tmp/wt-queue">
  <MergeQueue>
    {prs.map((pr) => (
      <Task key={pr.id} id={`apply-${pr.id}`} output={outputs.outputC}>
        {{ value: pr.id }}
      </Task>
    ))}
  </MergeQueue>
</Worktree>
```

## Internals

- Renders to `<smithers:worktree>`.
- Extraction assigns `worktreeId` and `worktreePath` to every descendant task descriptor.
- The scheduler is unaware of worktrees; the engine consumes these fields to scope JJ operations and reverts.
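
The extraction step can be pictured as a tree walk that keeps a stack of enclosing worktrees. This is a minimal sketch over a simplified, hypothetical tree shape (`TreeNode`, `ExtractedTask`, and `extractTasks` are not Smithers APIs), and it assumes paths have already been resolved.

```typescript
// Hypothetical, simplified tree: real Smithers elements carry more fields.
type TreeNode =
  | { kind: "worktree"; id: string; path: string; children: TreeNode[] }
  | { kind: "task"; id: string };

interface WorktreeFrame {
  id: string;
  path: string;
}

interface ExtractedTask {
  id: string;
  worktreeId: string | undefined;
  worktreePath: string | undefined;
}

// Walk the tree; each task takes its fields from the innermost enclosing worktree.
function extractTasks(
  nodes: TreeNode[],
  worktreeStack: WorktreeFrame[] = [],
): ExtractedTask[] {
  const tasks: ExtractedTask[] = [];
  for (const node of nodes) {
    if (node.kind === "worktree") {
      const nestedStack = worktreeStack.concat([{ id: node.id, path: node.path }]);
      for (const task of extractTasks(node.children, nestedStack)) tasks.push(task);
    } else {
      const innermost =
        worktreeStack.length > 0 ? worktreeStack[worktreeStack.length - 1] : undefined;
      tasks.push({
        id: node.id,
        worktreeId: innermost?.id,
        worktreePath: innermost?.path,
      });
    }
  }
  return tasks;
}
```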

## Notes

- Duplicate `id` values are rejected.
- Use `baseBranch` to create/rebase from something other than `main`.
- Prefer absolute, ephemeral paths in CI and relative paths for local portability.

---

## <Voice>

> Wrap a subtree with voice I/O capabilities using a VoiceProvider.
> Source: https://smithers.sh/components/voice

## Import

```tsx
import { Voice, Task } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `provider` | `VoiceProvider` | -- | Voice provider instance. Required. |
| `speaker` | `string` | `undefined` | Default speaker/voice ID for TTS within this subtree. |
| `children` | `ReactNode` | -- | Nested tasks and control-flow nodes. |

## Basics

```tsx
<Voice provider={voice} speaker="alloy">
  <Task id="transcribe" output={outputs.transcript} agent={myAgent}>
    Transcribe the audio and return the text.
  </Task>
</Voice>
```

Descendant tasks receive `voice` and `voiceSpeaker` on their descriptors. The engine uses these fields to invoke voice operations around agent execution.

## Nesting

The innermost `<Voice>` in scope determines a task's effective voice provider:

```tsx
<Voice provider={openaiVoice} speaker="alloy">
  <Task id="a" output={outputs.out}>Uses openaiVoice with alloy</Task>
  <Voice provider={elevenLabsVoice} speaker="rachel">
    <Task id="b" output={outputs.out}>Uses elevenLabsVoice with rachel</Task>
  </Voice>
</Voice>
```

## With Other Components

`<Voice>` composes with all existing control-flow components:

```tsx
<Voice provider={voice}>
  <Parallel>
    <Task id="transcribe-en" output={outputs.en} agent={agent}>
      Transcribe the English audio.
    </Task>
    <Task id="transcribe-fr" output={outputs.fr} agent={agent}>
      Transcribe the French audio.
    </Task>
  </Parallel>
</Voice>
```

## Internals

- Renders to `<smithers:voice>`.
- Extraction assigns `voice` and `voiceSpeaker` to every descendant task descriptor via a `voiceStack` pattern, matching how `worktreeStack` and `parallelStack` work.
- The scheduler is unaware of voice; the engine consumes these fields at task execution time.

---

## <Kanban>

> Process items through columns with a pluggable ticket source. Supports a triage, work, review loop.
> Source: https://smithers.sh/components/kanban

```tsx
import { Kanban } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"kanban"` | ID prefix for all generated task and loop elements. |
| `columns` | `ColumnDef[]` | **(required)** | Column definitions in order. Items flow left to right through each column. |
| `useTickets` | `() => Array<{ id: string }>` | **(required)** | Function that returns ticket items to process. Each item must have an `id` field. |
| `agents` | `Record<string, AgentLike>` | `undefined` | Record mapping column names to agents. Overrides column-level agents. |
| `maxConcurrency` | `number` | `Infinity` | Max items processed in parallel per column. |
| `onComplete` | `OutputTarget` | `undefined` | Output schema for the completion task when items reach the final column. |
| `until` | `boolean` | `false` | Loop exit condition. When `true`, the board loop stops. |
| `maxIterations` | `number` | `5` | Max iterations through the column pipeline. |
| `skipIf` | `boolean` | `false` | Skip the entire board. Returns `null`. |
| `children` | `ReactNode` | `undefined` | Content passed to the `onComplete` task, if present. |

### ColumnDef

| Field | Type | Description |
| --- | --- | --- |
| `name` | `string` | Column name (e.g., `"backlog"`, `"review"`). |
| `agent` | `AgentLike` | Agent that processes items in this column. |
| `output` | `OutputTarget` | Output schema for tasks in this column. |
| `prompt` | `(ctx: { item, column }) => string` | Optional prompt template. Receives the item and column name. |
| `task` | `Partial<TaskProps>` | Optional task overrides applied to each generated item task in the column. Use this to set `retries`, `timeoutMs`, `heartbeatTimeoutMs`, or override `continueOnFail`. |

## Basic usage

```tsx
const columns = [
  { name: "triage", agent: triageAgent, output: outputs.triage },
  { name: "work", agent: workerAgent, output: outputs.work },
  { name: "review", agent: reviewAgent, output: outputs.review },
];

<Workflow name="ticket-board">
  <Kanban
    columns={columns}
    useTickets={() => tickets}
    until={allDone}
    maxIterations={3}
  />
</Workflow>
```

## With concurrency limits

```tsx
<Kanban
  id="pr-queue"
  columns={columns}
  useTickets={() => pullRequests}
  maxConcurrency={2}
  until={queueEmpty}
/>
```

At most two items are processed simultaneously within each column.

## Overriding agents per column

The `agents` prop overrides column-level agents:

```tsx
<Kanban
  columns={columns}
  useTickets={() => tickets}
  agents={{
    review: seniorReviewAgent,
    work: juniorDevAgent,
  }}
  until={done}
/>
```

## With completion handler

When `onComplete` is provided, a final task runs after the loop exits:

```tsx
<Kanban
  columns={columns}
  useTickets={() => tickets}
  onComplete={outputs.boardSummary}
  until={allDone}
>
  Summarize the board results.
</Kanban>
```

## Custom prompts per column

```tsx
const columns = [
  {
    name: "triage",
    agent: triageAgent,
    output: outputs.triage,
    prompt: ({ item }) => `Triage ticket: ${item.title}\n${item.description}`,
  },
  {
    name: "implement",
    agent: codeAgent,
    output: outputs.impl,
    prompt: ({ item }) => `Implement the fix for: ${item.title}`,
  },
];
```

## Per-column task policy

Use `task` when a lane needs explicit retries or different runtime limits:

```tsx
const columns = [
  { name: "triage", agent: triageAgent, output: outputs.triage },
  {
    name: "work",
    agent: workerAgent,
    output: outputs.work,
    task: {
      retries: 2,
      timeoutMs: 30_000,
    },
  },
];
```

## Structure

`<Kanban>` composes existing primitives. It does not create a new host element type. The rendered tree looks like:

```
Loop (until / maxIterations)
  Sequence
    Parallel (column 1 — all items)
    Parallel (column 2 — all items)
    ...
    Task (onComplete, if provided)
```

## Notes

- `<Kanban>` is a composite component. It renders a tree of `<Loop>`, `<Sequence>`, `<Parallel>`, and `<Task>` elements.
- Each column creates a `<Parallel>` block where all ticket items are processed concurrently (bounded by `maxConcurrency`).
- Generated item tasks default to `continueOnFail={true}` so one item does not block the rest of the board. Use `column.task` to add retries or override that behavior.
- The `useTickets` function is called at render time. Return different items each iteration to implement dynamic ticket sources.
- Use `until` with `ctx.outputMaybe()` to exit the loop when all items reach the final column.
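
The precedence rules above (board-level `agents` over column-level `agent`, and `column.task` over the `continueOnFail` default) can be summarized in a small sketch. `resolveColumnTask`, `AgentLike`, and `TaskPolicy` here are hypothetical stand-ins, not the component's actual internals.

```typescript
// Hypothetical stand-ins for the real Smithers types.
type AgentLike = { id: string };

interface TaskPolicy {
  retries?: number;
  timeoutMs?: number;
  heartbeatTimeoutMs?: number;
  continueOnFail?: boolean;
}

interface ColumnDef {
  name: string;
  agent: AgentLike;
  task?: TaskPolicy;
}

interface ResolvedColumnTask extends TaskPolicy {
  agent: AgentLike;
  continueOnFail: boolean;
}

// Resolve the effective agent and task policy for one column.
function resolveColumnTask(
  column: ColumnDef,
  agents: Record<string, AgentLike> = {},
): ResolvedColumnTask {
  const policy: TaskPolicy = column.task ?? {};
  return {
    ...policy,
    // The board-level `agents` record wins over the column's own agent.
    agent: agents[column.name] ?? column.agent,
    // Item tasks default to continueOnFail unless the column overrides it.
    continueOnFail: policy.continueOnFail ?? true,
  };
}
```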

---

## <ClassifyAndRoute>

> Classify items into categories then route each to a category-specific agent in parallel.
> Source: https://smithers.sh/components/classify-and-route

```tsx
import { ClassifyAndRoute } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"classify-and-route"` | ID prefix for all generated task elements. |
| `items` | `unknown \| unknown[]` | **(required)** | Items to classify. A single item or array. |
| `categories` | `Record<string, AgentLike \| CategoryConfig>` | **(required)** | Maps category names to agents or config objects. |
| `classifierAgent` | `AgentLike` | **(required)** | Agent that classifies items into categories. |
| `classifierOutput` | `OutputTarget` | **(required)** | Output schema for the classification task. |
| `routeOutput` | `OutputTarget` | **(required)** | Default output schema for routed work. |
| `classificationResult` | `{ classifications: Array<{ category, itemId? }> } \| null` | `undefined` | Classification result used to drive routing. Typically from `ctx.outputMaybe()`. |
| `maxConcurrency` | `number` | `Infinity` | Max parallel route handlers. |
| `skipIf` | `boolean` | `false` | Skip the entire classify-and-route block. Returns `null`. |
| `children` | `ReactNode` | `undefined` | Custom prompt content for the classification task. |

### CategoryConfig

| Field | Type | Description |
| --- | --- | --- |
| `agent` | `AgentLike` | Agent that handles items in this category. |
| `output` | `OutputTarget` | Optional output schema override for this category. |
| `prompt` | `(item) => string` | Optional prompt template for the route handler. |

## Basic usage

```tsx
const classification = ctx.outputMaybe(outputs.classification, {
  nodeId: "classify-and-route-classify",
});

<Workflow name="support-router">
  <ClassifyAndRoute
    items={ctx.input.tickets}
    categories={{
      billing: billingAgent,
      support: supportAgent,
      sales: salesAgent,
    }}
    classifierAgent={classifierAgent}
    classifierOutput={outputs.classification}
    routeOutput={outputs.handled}
    classificationResult={classification}
  />
</Workflow>
```

## With category configs

Pass config objects instead of bare agents for per-category output schemas and prompts:

```tsx
<ClassifyAndRoute
  items={messages}
  categories={{
    urgent: {
      agent: urgentHandler,
      output: outputs.urgentResult,
      prompt: (item) => `URGENT: Handle immediately.\n${JSON.stringify(item)}`,
    },
    normal: {
      agent: normalHandler,
      output: outputs.normalResult,
    },
  }}
  classifierAgent={classifierAgent}
  classifierOutput={outputs.classification}
  routeOutput={outputs.defaultResult}
  classificationResult={classification}
/>
```

## Custom classifier prompt

Use `children` to provide a custom prompt for the classification task:

```tsx
<ClassifyAndRoute
  items={ctx.input.emails}
  categories={categoryMap}
  classifierAgent={classifierAgent}
  classifierOutput={outputs.classification}
  routeOutput={outputs.handled}
  classificationResult={classification}
>
  Classify each email by department. Use "engineering" for bug reports,
  "sales" for pricing questions, "hr" for internal requests.
</ClassifyAndRoute>
```

## Limiting concurrency

```tsx
<ClassifyAndRoute
  items={ctx.input.items}
  categories={categories}
  classifierAgent={classifierAgent}
  classifierOutput={outputs.classification}
  routeOutput={outputs.result}
  classificationResult={classification}
  maxConcurrency={3}
/>
```

At most three route handlers run simultaneously.

## Structure

`<ClassifyAndRoute>` composes existing primitives. The rendered tree looks like:

```
Sequence
  Task (classifier — assigns categories)
  Parallel (route handlers — one per classified item)
    Task (route handler for item 1)
    Task (route handler for item 2)
    ...
```

## Two-phase rendering

The component uses a two-phase approach driven by Smithers reactivity:

1. **First render**: The classification task runs. `classificationResult` is not yet available (`null` or `undefined`), so no routes are mounted.
2. **Re-render**: After classification completes, pass the result via `classificationResult` (typically `ctx.outputMaybe()`). The route handlers are now mounted and execute in parallel.

This pattern follows standard Smithers data-dependent control flow.

## Notes

- `<ClassifyAndRoute>` is a composite component. It renders `<Sequence>`, `<Task>`, and `<Parallel>` elements.
- The `classificationResult` prop drives routing. Each entry's `category` field must match a key in `categories`.
- Unrecognized categories (no matching key in `categories`) are silently skipped.
- Route tasks have `continueOnFail` enabled by default so a single handler failure does not block others.
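
The routing rules can be sketched as a pure function: no routes before a classification exists, unknown categories skipped, and per-category `output` falling back to `routeOutput`. All names below (`planRoutes`, `isCategoryConfig`, and the stand-in types) are hypothetical; in particular, telling a bare agent from a config object by the presence of an `agent` key is an assumption.

```typescript
// Hypothetical stand-ins for the real Smithers types.
type AgentLike = { id: string };
type OutputTarget = string;

interface CategoryConfig {
  agent: AgentLike;
  output?: OutputTarget;
  prompt?: (item: unknown) => string;
}

interface Classification {
  category: string;
  itemId?: string;
}

interface Route {
  category: string;
  itemId: string | undefined;
  agent: AgentLike;
  output: OutputTarget;
}

// ASSUMPTION: a config object is recognized by its `agent` key.
function isCategoryConfig(value: AgentLike | CategoryConfig): value is CategoryConfig {
  return "agent" in value;
}

function planRoutes(
  result: { classifications: Classification[] } | null | undefined,
  categories: Record<string, AgentLike | CategoryConfig>,
  routeOutput: OutputTarget,
): Route[] {
  if (!result) return []; // first render: nothing classified yet
  const routes: Route[] = [];
  for (const entry of result.classifications) {
    const target = categories[entry.category];
    if (target === undefined) continue; // unrecognized category: skipped
    const config: CategoryConfig = isCategoryConfig(target) ? target : { agent: target };
    routes.push({
      category: entry.category,
      itemId: entry.itemId,
      agent: config.agent,
      output: config.output ?? routeOutput,
    });
  }
  return routes;
}
```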

---

## <GatherAndSynthesize>

> Parallel data collection from different sources followed by synthesis into a unified result.
> Source: https://smithers.sh/components/gather-and-synthesize

```tsx
import { GatherAndSynthesize } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"gather-and-synthesize"` | ID prefix for all generated task elements. |
| `sources` | `Record<string, SourceDef>` | **(required)** | Maps source names to source definitions with agent, prompt, and optional output. |
| `synthesizer` | `AgentLike` | **(required)** | Agent that synthesizes gathered data into a unified result. |
| `gatherOutput` | `OutputTarget` | **(required)** | Default output schema for each source gather task. |
| `synthesisOutput` | `OutputTarget` | **(required)** | Output schema for the synthesis task. |
| `gatheredResults` | `Record<string, unknown> \| null` | `undefined` | Gathered results keyed by source name. Passed to the synthesis prompt. Typically from `ctx.outputMaybe()`. |
| `maxConcurrency` | `number` | `Infinity` | Max parallel gatherers. |
| `synthesisPrompt` | `string` | auto-generated | Custom prompt for the synthesis task. |
| `skipIf` | `boolean` | `false` | Skip the entire gather-and-synthesize block. Returns `null`. |
| `children` | `ReactNode` | `undefined` | Custom content for the synthesis task. Overrides `synthesisPrompt`. |

### SourceDef

| Field | Type | Description |
| --- | --- | --- |
| `agent` | `AgentLike` | Agent that gathers data from this source. |
| `prompt` | `string` | Prompt for the gather task. |
| `output` | `OutputTarget` | Optional output schema override for this source. |
| `children` | `ReactNode` | Optional ReactNode content for the gather task. Overrides `prompt`. |

## Basic usage

```tsx
<Workflow name="research">
  <GatherAndSynthesize
    sources={{
      docs: { agent: docsAgent, prompt: "Search the documentation." },
      code: { agent: codeAgent, prompt: "Analyze the codebase." },
      issues: { agent: issueAgent, prompt: "Review open issues." },
    }}
    synthesizer={synthesisAgent}
    gatherOutput={outputs.gathered}
    synthesisOutput={outputs.synthesis}
    gatheredResults={gathered}
  />
</Workflow>
```

## With per-source output schemas

```tsx
<GatherAndSynthesize
  sources={{
    metrics: {
      agent: metricsAgent,
      prompt: "Collect performance metrics.",
      output: outputs.metrics,
    },
    logs: {
      agent: logAgent,
      prompt: "Analyze recent error logs.",
      output: outputs.logs,
    },
    alerts: {
      agent: alertAgent,
      prompt: "Fetch active alerts.",
      output: outputs.alerts,
    },
  }}
  synthesizer={incidentAgent}
  gatherOutput={outputs.defaultGather}
  synthesisOutput={outputs.incidentReport}
  gatheredResults={gathered}
/>
```

## Custom synthesis prompt

```tsx
<GatherAndSynthesize
  sources={sources}
  synthesizer={synthesisAgent}
  gatherOutput={outputs.gathered}
  synthesisOutput={outputs.report}
  gatheredResults={gathered}
  synthesisPrompt="Combine all research findings into an executive summary with recommendations."
/>
```

Or use `children` for richer content:

```tsx
<GatherAndSynthesize
  sources={sources}
  synthesizer={synthesisAgent}
  gatherOutput={outputs.gathered}
  synthesisOutput={outputs.report}
  gatheredResults={gathered}
>
  Write a comprehensive report combining all gathered data.
  Focus on actionable insights and prioritized recommendations.
</GatherAndSynthesize>
```

## Limiting concurrency

```tsx
<GatherAndSynthesize
  sources={manySources}
  synthesizer={synthesisAgent}
  gatherOutput={outputs.gathered}
  synthesisOutput={outputs.synthesis}
  gatheredResults={gathered}
  maxConcurrency={3}
/>
```

At most three source agents gather data simultaneously.

## Structure

`<GatherAndSynthesize>` composes existing primitives. The rendered tree looks like:

```
Sequence
  Parallel (gather phase)
    Task (gather from source A)
    Task (gather from source B)
    ...
  Task (synthesis — depends on all gather tasks via needs)
```

## Dependency wiring

The synthesis task automatically receives `needs` entries for each source. Each source name becomes a key in `needs`, pointing to the corresponding gather task ID. This ensures the synthesis task does not run until all gather tasks complete.
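
A minimal sketch of that wiring follows. The gather task id format (`{prefix}-{sourceName}`) is an assumption made for illustration, since the exact format is not stated here, and both helper names are hypothetical.

```typescript
// Build the `needs` record: source name -> gather task id.
function buildSynthesisNeeds(
  prefix: string,
  sourceNames: string[],
): Record<string, string> {
  const needs: Record<string, string> = {};
  for (const source of sourceNames) {
    needs[source] = prefix + "-" + source; // ASSUMED id format, for illustration only
  }
  return needs;
}

// The synthesis task is runnable only once every needed gather task has completed.
function synthesisReady(
  needs: Record<string, string>,
  completedTaskIds: string[],
): boolean {
  return Object.keys(needs).every(
    (source) => completedTaskIds.indexOf(needs[source]) !== -1,
  );
}
```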

## Notes

- `<GatherAndSynthesize>` is a composite component. It renders `<Sequence>`, `<Parallel>`, and `<Task>` elements.
- Each source creates a separate gather `<Task>` with its own agent.
- The synthesis `<Task>` uses `needs` to depend on all gather tasks, ensuring it runs after all data is collected.
- When `gatheredResults` is provided, it is formatted into the default synthesis prompt. Pass `synthesisPrompt` or `children` to override.
- Sources with a `children` field use that as the task content, taking priority over `prompt`.

---

## <Panel>

> Parallel specialist agents review the same input, then a moderator synthesizes results with optional voting/quorum strategies.
> Source: https://smithers.sh/components/panel

```tsx
import { Panel } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"panel"` | ID prefix for generated task ids. |
| `panelists` | `PanelistConfig[] \| AgentLike[]` | **(required)** | Specialist agents. Each entry is `{ agent, role?, label? }` or a bare `AgentLike`. |
| `moderator` | `AgentLike` | **(required)** | Agent that synthesizes all panelist outputs into a final result. |
| `panelistOutput` | `OutputTarget` | **(required)** | Output schema for each panelist task. |
| `moderatorOutput` | `OutputTarget` | **(required)** | Output schema for the moderator synthesis task. |
| `strategy` | `"synthesize" \| "vote" \| "consensus"` | `"synthesize"` | How the moderator combines results. `"synthesize"` merges freely; `"vote"` counts agreement; `"consensus"` requires convergence. |
| `minAgree` | `number` | `undefined` | Minimum panelists that must agree (used with `"vote"` and `"consensus"` strategies). |
| `maxConcurrency` | `number` | `Infinity` | Maximum panelists running in parallel. |
| `skipIf` | `boolean` | `false` | Skip the entire panel. Returns `null`. |
| `children` | `string \| ReactNode` | **(required)** | Prompt or input sent to every panelist. |

## Basic usage

```tsx
<Workflow name="code-review-panel">
  <Panel
    panelists={[
      { agent: securityAgent, role: "Security Reviewer" },
      { agent: qualityAgent, role: "Code Quality Reviewer" },
      { agent: architectureAgent, role: "Architecture Reviewer" },
    ]}
    moderator={moderatorAgent}
    panelistOutput={outputs.review}
    moderatorOutput={outputs.synthesis}
  >
    Review the changes in src/auth/ for security, quality, and architecture concerns.
  </Panel>
</Workflow>
```

This renders as:

1. Three panelist tasks run in parallel, each receiving the same prompt.
2. A moderator task runs after all panelists complete, receiving their outputs via `needs`.

## Voting strategy

Use `strategy="vote"` with `minAgree` to require quorum:

```tsx
<Panel
  panelists={[
    { agent: reviewer1, role: "Reviewer A" },
    { agent: reviewer2, role: "Reviewer B" },
    { agent: reviewer3, role: "Reviewer C" },
  ]}
  moderator={moderatorAgent}
  panelistOutput={outputs.review}
  moderatorOutput={outputs.verdict}
  strategy="vote"
  minAgree={2}
>
  Should we approve this RFC? Evaluate the proposal and vote approve or reject.
</Panel>
```

The moderator receives the vote strategy and minimum agreement threshold in its prompt context.
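
Because the moderator agent interprets the threshold, it can help to see what a quorum check amounts to. The sketch below is a deterministic tally you might use to sanity-check a verdict; it assumes each panelist output exposes a string vote, which is not part of the documented schema, and `tallyVotes` is a hypothetical helper.

```typescript
interface Tally {
  winner: string | null;
  count: number;
  quorum: boolean;
}

// Count votes and report whether the leading option reaches `minAgree`.
// Ties resolve to the option that appeared first.
function tallyVotes(votes: string[], minAgree: number): Tally {
  const counts: Record<string, number> = {};
  for (const vote of votes) {
    counts[vote] = (counts[vote] ?? 0) + 1;
  }
  let winner: string | null = null;
  let count = 0;
  for (const option of Object.keys(counts)) {
    if (counts[option] > count) {
      winner = option;
      count = counts[option];
    }
  }
  return { winner, count, quorum: count >= minAgree };
}
```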

## Consensus strategy

Use `strategy="consensus"` to require panelists to converge on a shared answer. The moderator checks whether panelists agree and, if `minAgree` is set, enforces a minimum threshold:

```tsx
<Panel
  panelists={[
    { agent: analyst1, role: "Risk Analyst A" },
    { agent: analyst2, role: "Risk Analyst B" },
    { agent: analyst3, role: "Risk Analyst C" },
  ]}
  moderator={moderatorAgent}
  panelistOutput={outputs.assessment}
  moderatorOutput={outputs.consensus}
  strategy="consensus"
  minAgree={3}
>
  Assess the risk level of deploying the new payment gateway to production this week.
</Panel>
```

The moderator receives the consensus strategy and threshold in its prompt context, and is responsible for determining whether the panelists have converged.

## Bare agent shorthand

When you don't need per-panelist roles, pass an array of agents directly:

```tsx
<Panel
  panelists={[agent1, agent2, agent3]}
  moderator={moderatorAgent}
  panelistOutput={outputs.review}
  moderatorOutput={outputs.synthesis}
>
  Analyze the quarterly report for discrepancies.
</Panel>
```

Each agent is auto-labeled `panelist-0`, `panelist-1`, etc.

## Limiting concurrency

```tsx
<Panel
  panelists={specialists}
  moderator={moderatorAgent}
  panelistOutput={outputs.review}
  moderatorOutput={outputs.synthesis}
  maxConcurrency={2}
>
  Review the deployment plan.
</Panel>
```

At most two panelists run at a time. The rest queue until a slot opens.

## Generated structure

`<Panel>` is a composite component. It does not create a new host element type. Internally it renders:

```
Sequence
  Parallel (maxConcurrency)
    Task (panelist 0)
    Task (panelist 1)
    ...
  Task (moderator, needs: all panelist ids)
```

## Notes

- Each panelist task id is `{prefix}-{label|role|panelist-N}`. The moderator task id is `{prefix}-moderator`.
- The moderator task uses `needs` to depend on all panelist tasks, so it runs only after every panelist completes.
- `strategy` and `minAgree` are passed as prompt context to the moderator. The moderator agent is responsible for interpreting and applying the strategy.
- Panelist outputs all write to the same `panelistOutput` schema, differentiated by their task id.
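
One reading of the id scheme above, sketched as code: `label` is preferred, then `role`, then the positional `panelist-N` fallback. That precedence, the `agent`-key check for bare agents, and the helper names are assumptions; whether Smithers sanitizes role strings before using them in ids is not covered here.

```typescript
// Hypothetical stand-in for the real Smithers agent type.
type AgentLike = { id: string };

interface PanelistConfig {
  agent: AgentLike;
  role?: string;
  label?: string;
}

// ASSUMPTION: a config object is recognized by its `agent` key.
function isPanelistConfig(value: PanelistConfig | AgentLike): value is PanelistConfig {
  return "agent" in value;
}

interface PanelIds {
  panelistIds: string[];
  moderatorId: string;
  moderatorNeeds: string[];
}

function panelTaskIds(
  panelists: Array<PanelistConfig | AgentLike>,
  prefix: string = "panel",
): PanelIds {
  const panelistIds = panelists.map((panelist, index) => {
    const config: PanelistConfig = isPanelistConfig(panelist)
      ? panelist
      : { agent: panelist };
    // ASSUMED precedence: label, then role, then the positional fallback.
    const suffix = config.label ?? config.role ?? "panelist-" + index;
    return prefix + "-" + suffix;
  });
  return {
    panelistIds,
    moderatorId: prefix + "-moderator",
    moderatorNeeds: panelistIds, // the moderator waits on every panelist
  };
}
```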

---

## <CheckSuite>

> Parallel checks with auto-aggregated pass/fail verdict.
> Source: https://smithers.sh/components/check-suite

```tsx
import { CheckSuite } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"checksuite"` | ID prefix for generated task ids. |
| `checks` | `CheckConfig[] \| Record<string, CheckConfig>` | **(required)** | Checks to run. Array of `{ id, agent?, command?, label? }` or an object keyed by check id. |
| `verdictOutput` | `OutputTarget` | **(required)** | Output schema for each check task and the aggregate verdict. |
| `strategy` | `"all-pass" \| "majority" \| "any-pass"` | `"all-pass"` | How individual results aggregate. `"all-pass"`: every check must pass. `"majority"`: more than half must pass. `"any-pass"`: one passing check is enough. |
| `maxConcurrency` | `number` | `Infinity` | Maximum checks running in parallel. |
| `continueOnFail` | `boolean` | `true` | When `true`, a failing check does not stop the suite and the remaining checks run to completion. When `false`, the suite stops as soon as any check fails. |
| `skipIf` | `boolean` | `false` | Skip the entire suite. Returns `null`. |

## Basic usage

```tsx
<Workflow name="ci-checks">
  <CheckSuite
    checks={[
      { id: "lint", agent: lintAgent, label: "ESLint" },
      { id: "typecheck", agent: typecheckAgent, label: "TypeScript" },
      { id: "test", agent: testAgent, label: "Unit Tests" },
    ]}
    verdictOutput={outputs.verdict}
  />
</Workflow>
```

This renders as:

1. Three check tasks run in parallel.
2. A verdict task runs after all checks complete, aggregating results into a pass/fail decision.

## Object syntax

Pass checks as a record instead of an array:

```tsx
<CheckSuite
  checks={{
    lint: { agent: lintAgent, label: "ESLint" },
    typecheck: { agent: typecheckAgent, label: "TypeScript" },
    security: { agent: securityAgent, label: "Security Scan" },
  }}
  verdictOutput={outputs.verdict}
  strategy="all-pass"
/>
```

Object keys become check ids automatically.
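
A sketch of how the two input shapes could be normalized into one list, with task ids following the `{prefix}-{checkId}` scheme described in the Notes. `normalizeChecks`, `checkTaskIds`, and the stand-in types are hypothetical.

```typescript
// Hypothetical stand-in for the real Smithers agent type.
type AgentLike = { id: string };

interface CheckConfig {
  id: string;
  agent?: AgentLike;
  command?: string;
  label?: string;
}

type CheckInput = CheckConfig[] | Record<string, Omit<CheckConfig, "id">>;

// Turn either input shape into a flat list where every check has an id.
function normalizeChecks(checks: CheckInput): CheckConfig[] {
  if (Array.isArray(checks)) return checks;
  const record = checks;
  // Object keys become the check ids.
  return Object.keys(record).map((id) => ({ ...record[id], id }));
}

// Derive task ids for each check plus the verdict task.
function checkTaskIds(checks: CheckInput, prefix: string = "checksuite") {
  const ids = normalizeChecks(checks).map((check) => prefix + "-" + check.id);
  return { checkIds: ids, verdictId: prefix + "-verdict" };
}
```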

## Majority strategy

Allow the suite to pass even if some checks fail:

```tsx
<CheckSuite
  checks={[
    { id: "perf", agent: perfAgent, label: "Performance" },
    { id: "a11y", agent: a11yAgent, label: "Accessibility" },
    { id: "seo", agent: seoAgent, label: "SEO" },
  ]}
  verdictOutput={outputs.verdict}
  strategy="majority"
/>
```

The verdict task receives the `"majority"` strategy and aggregates accordingly.

## Any-pass strategy

Use `"any-pass"` when a single passing check is sufficient:

```tsx
<CheckSuite
  checks={[
    { id: "region-us", agent: healthAgent, label: "US region" },
    { id: "region-eu", agent: healthAgent, label: "EU region" },
    { id: "region-ap", agent: healthAgent, label: "AP region" },
  ]}
  verdictOutput={outputs.verdict}
  strategy="any-pass"
/>
```

The suite passes as long as at least one region is healthy.

## Command-based checks

Checks can use `command` instead of `agent` for shell-based checks:

```tsx
<CheckSuite
  checks={[
    { id: "lint", command: "npm run lint", label: "Lint" },
    { id: "typecheck", command: "npx tsc --noEmit", label: "Type check" },
    { id: "test", command: "npm test", label: "Unit tests" },
  ]}
  verdictOutput={outputs.verdict}
/>
```

## Fail-fast mode

Set `continueOnFail={false}` to stop the suite as soon as any check fails:

```tsx
<CheckSuite
  checks={checks}
  verdictOutput={outputs.verdict}
  continueOnFail={false}
/>
```

## Limiting concurrency

```tsx
<CheckSuite
  checks={checks}
  verdictOutput={outputs.verdict}
  maxConcurrency={3}
/>
```

At most three checks run at a time.

## Generated structure

`<CheckSuite>` is a composite component. It does not create a new host element type. Internally it renders:

```
Sequence
  Parallel (maxConcurrency)
    Task (check 0, continueOnFail)
    Task (check 1, continueOnFail)
    ...
  Task (verdict, needs: all check ids)
```

## Notes

- Each check task id is `{prefix}-{checkId}`. The verdict task id is `{prefix}-verdict`.
- The verdict task uses `needs` to depend on all check tasks.
- `strategy` is passed as prompt context to the verdict aggregation task. When using agent-based checks, the aggregation logic depends on the verdict agent interpreting the strategy.
- When `continueOnFail` is `true` (default), all checks run to completion even if some fail. The verdict task can then inspect which checks passed or failed.
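
For reference, the three strategies reduce to the following deterministic rules when applied to a list of pass/fail results. This is a sketch of the definitions in the props table, not the component's implementation, which passes `strategy` to the verdict task as prompt context; `aggregateVerdict` is a hypothetical helper.

```typescript
type Strategy = "all-pass" | "majority" | "any-pass";

// Aggregate individual check results (true = passed) into a single verdict.
function aggregateVerdict(results: boolean[], strategy: Strategy): boolean {
  const passed = results.filter((ok) => ok).length;
  switch (strategy) {
    case "all-pass":
      return passed === results.length; // every check must pass
    case "majority":
      return passed > results.length / 2; // strictly more than half
    case "any-pass":
      return passed >= 1; // one passing check is enough
  }
}
```

Note that under `"majority"` an even split (for example one of two checks passing) does not pass, because the rule requires more than half.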

---

## <Debate>

> Adversarial multi-round debate between a proposer and opponent, followed by a judge verdict.
> Source: https://smithers.sh/components/debate

```tsx
import { Debate } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"debate"` | ID prefix for generated task ids. |
| `proposer` | `AgentLike` | **(required)** | Agent arguing FOR the topic. |
| `opponent` | `AgentLike` | **(required)** | Agent arguing AGAINST the topic. |
| `judge` | `AgentLike` | **(required)** | Agent rendering the final verdict after all rounds. |
| `rounds` | `number` | `2` | Number of debate rounds. Each round has both sides arguing in parallel. |
| `argumentOutput` | `OutputTarget` | **(required)** | Output schema for proposer and opponent argument tasks. |
| `verdictOutput` | `OutputTarget` | **(required)** | Output schema for the judge verdict task. |
| `topic` | `string \| ReactNode` | **(required)** | The debate topic. Passed to all participants. |
| `skipIf` | `boolean` | `false` | Skip the entire debate. Returns `null`. |

## Basic usage

```tsx
<Workflow name="architecture-debate">
  <Debate
    proposer={monolithAdvocate}
    opponent={microservicesAdvocate}
    judge={architectureJudge}
    rounds={3}
    argumentOutput={outputs.argument}
    verdictOutput={outputs.verdict}
    topic="Should we migrate from a monolith to microservices for the payments system?"
  />
</Workflow>
```

This renders as:

1. A loop running for 3 rounds.
2. Each round: proposer and opponent argue in parallel.
3. After all rounds: the judge reviews all arguments and renders a verdict.

## Two-round default

Omit `rounds` for the default two-round debate:

```tsx
<Debate
  proposer={proAgent}
  opponent={conAgent}
  judge={judgeAgent}
  argumentOutput={outputs.argument}
  verdictOutput={outputs.verdict}
  topic="Should we adopt GraphQL over REST for our public API?"
/>
```

## Technology selection

```tsx
<Debate
  proposer={rustAdvocate}
  opponent={goAdvocate}
  judge={techLeadAgent}
  rounds={2}
  argumentOutput={outputs.argument}
  verdictOutput={outputs.verdict}
  topic={`Evaluate Rust vs Go for the new CLI tool.
Requirements: fast startup, cross-compilation, small binary size.`}
/>
```

## MDX topic

The `topic` prop accepts ReactNode, so you can use MDX prompts:

```tsx
<Debate
  proposer={proAgent}
  opponent={conAgent}
  judge={judgeAgent}
  argumentOutput={outputs.argument}
  verdictOutput={outputs.verdict}
  topic={<DebateTopicPrompt context={ctx.input.context} />}
/>
```

## Generated structure

`<Debate>` is a composite component. It does not create a new host element type. Internally it renders:

```
Sequence
  Loop (maxIterations=rounds)
    Sequence
      Parallel
        Task (proposer)
        Task (opponent)
  Task (judge, needs: proposer + opponent)
```

## Notes

- The proposer task id is `{prefix}-proposer`, the opponent is `{prefix}-opponent`, and the judge is `{prefix}-judge`.
- The loop id is `{prefix}-loop`. It runs for exactly `rounds` iterations using `maxIterations` with `onMaxReached="return-last"`.
- Both the proposer and opponent write to the same `argumentOutput` schema, differentiated by task id.
- The judge task uses `needs` to depend on both the proposer and opponent tasks, receiving all round outputs.
- For more control over rebuttals and per-round prompt customization, compose `Loop`, `Parallel`, and `Task` directly as shown in the `examples/debate.tsx` example.
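
The execution order implied by the generated structure can be written out as a schedule: one parallel step per round, followed by the judge. `debateSchedule` is a hypothetical helper used only to make the ordering concrete; the task ids follow the scheme in the Notes.

```typescript
// Each inner array is one step; ids within a step run in parallel.
function debateSchedule(prefix: string = "debate", rounds: number = 2): string[][] {
  const steps: string[][] = [];
  for (let round = 0; round < rounds; round++) {
    // Both sides argue in parallel within a round.
    steps.push([prefix + "-proposer", prefix + "-opponent"]);
  }
  // The judge runs once, after the final round.
  steps.push([prefix + "-judge"]);
  return steps;
}
```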

---

## <ReviewLoop>

> Produce, review, fix, and repeat until approved. A composite component that wires Loop, Sequence, and Task into a standard review-cycle pattern.
> Source: https://smithers.sh/components/review-loop

```tsx
import { ReviewLoop } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"review-loop"` | ID prefix. Task ids are derived as `{id}-produce` and `{id}-review`. |
| `producer` | `AgentLike` | **(required)** | Agent that produces or fixes the work each iteration. |
| `reviewer` | `AgentLike \| AgentLike[]` | **(required)** | Agent (or agents) that reviews the produced work. |
| `produceOutput` | `OutputTarget` | **(required)** | Output schema for the produced work. |
| `reviewOutput` | `OutputTarget` | **(required)** | Output schema for the review. Must include an `approved: boolean` field. |
| `maxIterations` | `number` | `5` | Maximum review cycles before stopping. |
| `onMaxReached` | `"return-last" \| "fail"` | `"return-last"` | Behavior when max iterations is reached. |
| `skipIf` | `boolean` | `false` | Skip the entire review loop. Returns `null`. |
| `children` | `string \| ReactNode` | **(required)** | Initial prompt for the producer. |

## Basic usage

```tsx
import { createSmithers } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const codeSchema = z.object({
  files: z.array(z.string()),
  summary: z.string(),
});

const reviewSchema = z.object({
  approved: z.boolean(),
  feedback: z.string(),
  issues: z.array(z.string()),
});

const { Workflow, smithers, outputs } = createSmithers({
  code: codeSchema,
  review: reviewSchema,
});

const coder = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a senior developer.",
});

const reviewer = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a strict code reviewer.",
});

export default smithers(() => (
  <Workflow name="code-review">
    <ReviewLoop
      producer={coder}
      reviewer={reviewer}
      produceOutput={outputs.code}
      reviewOutput={outputs.review}
      maxIterations={3}
    >
      Implement a REST API for user authentication with JWT tokens.
    </ReviewLoop>
  </Workflow>
));
```

## Multiple reviewers

Pass an array of agents to `reviewer`. The agents form the standard agent fallback chain, so the runtime tries them in order:

```tsx
<ReviewLoop
  producer={writer}
  reviewer={[securityReviewer, styleReviewer]}
  produceOutput={outputs.draft}
  reviewOutput={outputs.review}
>
  Write a security policy document.
</ReviewLoop>
```

## Fail on max iterations

```tsx
<ReviewLoop
  producer={coder}
  reviewer={reviewer}
  produceOutput={outputs.code}
  reviewOutput={outputs.review}
  maxIterations={5}
  onMaxReached="fail"
>
  Implement the payment processing module.
</ReviewLoop>
```

When `onMaxReached` is `"fail"`, the workflow fails if the reviewer has not approved after the maximum number of iterations.

## What it expands to

`<ReviewLoop>` is a composite component. It renders this tree:

```tsx
<Loop id={id} until={false} maxIterations={maxIterations} onMaxReached={onMaxReached}>
  <Sequence>
    <Task id={`${id}-produce`} output={produceOutput} agent={producer}>
      {children}
    </Task>
    <Task id={`${id}-review`} output={reviewOutput} agent={reviewer} needs={{ produced: `${id}-produce` }}>
      Review the produced work and decide whether to approve.
    </Task>
  </Sequence>
</Loop>
```

The runtime reads `reviewOutput` for the `approved` field each frame and exits the loop when `approved` is `true`.
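
The exit rule can be modeled outside the runtime. The sketch below is a plain TypeScript simulation, not Smithers code: `runReviewLoop`, `Review`, and the callback signatures are hypothetical names chosen for illustration, and it assumes that feedback from a rejected review is handed to the next produce step.

```ts
// Hypothetical model of the ReviewLoop control flow. Not the Smithers runtime.
type Review = { approved: boolean; feedback: string };
type OnMaxReached = "return-last" | "fail";

interface ReviewLoopResult<T> {
  work: T;
  review: Review;
  iterations: number;
}

function runReviewLoop<T>(
  produce: (feedback: string | undefined) => T,
  review: (work: T) => Review,
  maxIterations = 5,
  onMaxReached: OnMaxReached = "return-last",
): ReviewLoopResult<T> {
  if (maxIterations < 1) throw new Error("maxIterations must be at least 1");
  let feedback: string | undefined;
  let iterations = 0;
  while (true) {
    iterations += 1;
    const work = produce(feedback);
    const verdict = review(work);
    const result = { work, review: verdict, iterations };
    // Exit as soon as the reviewer approves.
    if (verdict.approved) return result;
    if (iterations >= maxIterations) {
      if (onMaxReached === "fail") {
        throw new Error(`not approved after ${maxIterations} iterations`);
      }
      return result; // "return-last": keep the most recent attempt
    }
    feedback = verdict.feedback;
  }
}

// Usage: the first draft is rejected, the second is approved.
const reviewLoopDemo = runReviewLoop<string>(
  (feedback) => (feedback ? `draft v2 (${feedback})` : "draft v1"),
  (work) =>
    work.startsWith("draft v2")
      ? { approved: true, feedback: "" }
      : { approved: false, feedback: "add tests" },
  3,
);
```

In this model `reviewLoopDemo.iterations` is `2`, because the loop stops on the first approved review rather than running all three iterations.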

## Notes

- The `reviewOutput` schema must include an `approved: boolean` field. The runtime uses this to determine when to exit the loop.
- On subsequent iterations the producer receives the reviewer's feedback through the loop's re-render cycle.
- Task ids are derived from the `id` prop: `{id}-produce` and `{id}-review`.
- Access iteration outputs using `ctx.latest()` or `ctx.outputs` in the parent workflow.

---

## <Optimizer>

> Generate, evaluate, and improve in a loop with score convergence. A composite component that wires Loop, Sequence, and Task into an iterative optimization pattern.
> Source: https://smithers.sh/components/optimizer

```tsx
import { Optimizer } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"optimizer"` | ID prefix. Task ids are derived as `{id}-generate` and `{id}-evaluate`. |
| `generator` | `AgentLike` | **(required)** | Agent that generates or improves candidates each iteration. |
| `evaluator` | `AgentLike \| Function` | **(required)** | Agent or compute function that scores candidates. |
| `generateOutput` | `OutputTarget` | **(required)** | Output schema for generated candidates. |
| `evaluateOutput` | `OutputTarget` | **(required)** | Output schema for evaluation results. Must include a `score: number` field. |
| `targetScore` | `number` | `undefined` | Score threshold to stop early. When omitted, runs all iterations. |
| `maxIterations` | `number` | `10` | Maximum optimization rounds. |
| `onMaxReached` | `"return-last" \| "fail"` | `"return-last"` | Behavior when max iterations is reached. |
| `skipIf` | `boolean` | `false` | Skip the entire optimization loop. Returns `null`. |
| `children` | `string \| ReactNode` | **(required)** | Initial generation prompt. |

## Basic usage

```tsx
import { createSmithers } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const promptSchema = z.object({
  promptText: z.string(),
  reasoning: z.string(),
});

const evalSchema = z.object({
  score: z.number().min(0).max(100),
  feedback: z.string(),
  strengths: z.array(z.string()),
  weaknesses: z.array(z.string()),
});

const { Workflow, smithers, outputs } = createSmithers({
  prompt: promptSchema,
  evaluation: evalSchema,
});

const promptEngineer = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a prompt engineer. Generate clear, effective prompts.",
});

const evaluator = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "Score prompts from 0-100 on clarity, specificity, and effectiveness.",
});

export default smithers(() => (
  <Workflow name="prompt-optimizer">
    <Optimizer
      generator={promptEngineer}
      evaluator={evaluator}
      generateOutput={outputs.prompt}
      evaluateOutput={outputs.evaluation}
      targetScore={90}
      maxIterations={5}
    >
      Generate a prompt for summarizing legal documents.
    </Optimizer>
  </Workflow>
));
```

## Compute evaluator

When the evaluator is deterministic (no LLM needed), pass a function instead of an agent. The function receives the candidate and returns the evaluation:

```tsx
<Optimizer
  generator={copywriter}
  evaluator={(candidate) => ({
    score: candidate.text.length > 100 ? 85 : 40,
    feedback: candidate.text.length > 100
      ? "Good length"
      : "Too short, expand the copy",
  })}
  generateOutput={outputs.copy}
  evaluateOutput={outputs.eval}
  targetScore={80}
>
  Write marketing copy for a developer tool.
</Optimizer>
```
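
Because a compute evaluator is a plain function, it can be lifted out of the JSX and unit-tested on its own. The sketch below restates the inline evaluator above as a named function. The `CopyCandidate` and `CopyEvaluation` types, the `evaluateCopy` name, and the `MIN_LENGTH` constant are assumptions for illustration; the real shapes come from whatever schemas back `outputs.copy` and `outputs.eval`.

```ts
// Assumed shapes, mirroring the fields used by the inline evaluator.
interface CopyCandidate {
  text: string;
}

interface CopyEvaluation {
  score: number;
  feedback: string;
}

const MIN_LENGTH = 100;

// Deterministic scoring: no LLM call, same input always gives the same score.
function evaluateCopy(candidate: CopyCandidate): CopyEvaluation {
  const longEnough = candidate.text.length > MIN_LENGTH;
  return {
    score: longEnough ? 85 : 40,
    feedback: longEnough ? "Good length" : "Too short, expand the copy",
  };
}
```

A named function like this could then be passed as `evaluator={evaluateCopy}`, which keeps the threshold logic in one testable place.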

## Run all iterations

Omit `targetScore` to run through all `maxIterations` and keep the best result:

```tsx
<Optimizer
  generator={designer}
  evaluator={critic}
  generateOutput={outputs.design}
  evaluateOutput={outputs.critique}
  maxIterations={8}
>
  Design a landing page layout for a SaaS product.
</Optimizer>
```

## What it expands to

`<Optimizer>` is a composite component. It renders this tree:

```tsx
<Loop id={id} until={false} maxIterations={maxIterations} onMaxReached={onMaxReached}>
  <Sequence>
    <Task id={`${id}-generate`} output={generateOutput} agent={generator}>
      {children}
    </Task>
    <Task id={`${id}-evaluate`} output={evaluateOutput} agent={evaluator} needs={{ candidate: `${id}-generate` }}>
      Evaluate the generated candidate and provide a score.
    </Task>
  </Sequence>
</Loop>
```

The runtime reads `evaluateOutput` for the `score` field each frame and exits the loop when the score meets `targetScore`.
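
The stop rule can be sketched as a small simulation. This is a hypothetical model, not the Smithers runtime: `runOptimizer` and its types are illustrative names, "meets" is assumed to mean greater than or equal to `targetScore`, and `onMaxReached` is left out to keep the example short.

```ts
// Hypothetical model of the Optimizer stop rule. Not the Smithers runtime.
type Evaluation = { score: number; feedback: string };

interface OptimizerResult<T> {
  candidate: T;
  evaluation: Evaluation;
  iterations: number;
  converged: boolean;
}

function runOptimizer<T>(
  generate: (previous: Evaluation | undefined) => T,
  evaluate: (candidate: T) => Evaluation,
  options: { targetScore?: number; maxIterations?: number } = {},
): OptimizerResult<T> {
  const { targetScore, maxIterations = 10 } = options;
  if (maxIterations < 1) throw new Error("maxIterations must be at least 1");
  let previous: Evaluation | undefined;
  let iterations = 0;
  while (true) {
    iterations += 1;
    const candidate = generate(previous);
    const evaluation = evaluate(candidate);
    // Assumption: "meets" means greater than or equal to the target.
    const converged = targetScore !== undefined && evaluation.score >= targetScore;
    if (converged || iterations >= maxIterations) {
      return { candidate, evaluation, iterations, converged };
    }
    previous = evaluation; // feed the evaluation into the next round
  }
}

// Usage: each round improves on the previous score by 15 points.
const optimizerDemo = runOptimizer<number>(
  (previous) => (previous ? previous.score + 15 : 65),
  (candidate) => ({ score: candidate, feedback: "keep going" }),
  { targetScore: 90, maxIterations: 5 },
);
```

Here the scores run 65, 80, 95, so the model stops after the third iteration with `converged: true`. Omitting `targetScore` makes `converged` stay `false` and the loop run to `maxIterations`.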

## Notes

- The `evaluateOutput` schema must include a `score: number` field. The runtime uses this to check convergence against `targetScore`.
- When `evaluator` is a function, the Task renders as a compute task rather than an agent task.
- Each iteration receives the previous evaluation's feedback through the loop's re-render cycle.
- Task ids are derived from the `id` prop: `{id}-generate` and `{id}-evaluate`.

---

## <ContentPipeline>

> Progressive content refinement through explicit stages. A composite component that wires Sequence and Task into a typed waterfall pipeline.
> Source: https://smithers.sh/components/content-pipeline

```tsx
import { ContentPipeline } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `undefined` | Optional ID prefix for the pipeline. |
| `stages` | `ContentPipelineStage[]` | **(required)** | Array of stage definitions executed in order. |
| `skipIf` | `boolean` | `false` | Skip the entire pipeline. Returns `null`. |
| `children` | `string \| ReactNode` | **(required)** | Initial prompt/content for the first stage. |

### ContentPipelineStage

| Field | Type | Description |
| --- | --- | --- |
| `id` | `string` | Unique identifier for this stage. Becomes the Task `id`. |
| `agent` | `AgentLike` | Agent that performs this stage's work. |
| `output` | `OutputTarget` | Output schema for this stage. |
| `label` | `string` | Optional human-readable label for the stage. |

## Basic usage

```tsx
import { createSmithers } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const outlineSchema = z.object({
  sections: z.array(z.string()),
  wordCount: z.number(),
});

const draftSchema = z.object({
  content: z.string(),
  wordCount: z.number(),
});

const editedSchema = z.object({
  content: z.string(),
  changes: z.array(z.string()),
});

const { Workflow, smithers, outputs } = createSmithers({
  outline: outlineSchema,
  draft: draftSchema,
  edited: editedSchema,
});

const outliner = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "Create structured outlines for blog posts.",
});

const writer = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "Write engaging blog content from outlines.",
});

const editor = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "Edit for clarity, grammar, and style.",
});

export default smithers(() => (
  <Workflow name="blog-pipeline">
    <ContentPipeline
      stages={[
        { id: "outline", agent: outliner, output: outputs.outline, label: "Create outline" },
        { id: "draft", agent: writer, output: outputs.draft, label: "Write draft" },
        { id: "edit", agent: editor, output: outputs.edited, label: "Edit and polish" },
      ]}
    >
      Write a blog post about building AI workflows with React components.
    </ContentPipeline>
  </Workflow>
));
```

## Two-stage pipeline

A minimal pipeline with just two stages:

```tsx
<ContentPipeline
  stages={[
    { id: "translate", agent: translator, output: outputs.translation, label: "Translate" },
    { id: "review", agent: nativeReviewer, output: outputs.reviewed, label: "Native review" },
  ]}
>
  Translate the product documentation to Japanese.
</ContentPipeline>
```

## Conditional skipping

```tsx
<ContentPipeline
  skipIf={ctx.input.skipEditing}
  stages={[
    { id: "draft", agent: writer, output: outputs.draft },
    { id: "edit", agent: editor, output: outputs.edited },
  ]}
>
  {ctx.input.topic}
</ContentPipeline>
```

## What it expands to

`<ContentPipeline>` is a composite component. For a three-stage pipeline it renders:

```tsx
<Sequence>
  <Task id="outline" output={outputs.outline} agent={outliner} label="Create outline">
    {children}
  </Task>
  <Task id="draft" output={outputs.draft} agent={writer} label="Write draft" needs={{ previous: "outline" }}>
    Continue from the previous stage's output. Perform: Write draft
  </Task>
  <Task id="edit" output={outputs.edited} agent={editor} label="Edit and polish" needs={{ previous: "draft" }}>
    Continue from the previous stage's output. Perform: Edit and polish
  </Task>
</Sequence>
```

Each stage after the first uses `needs` to depend on the previous stage, creating a typed waterfall.
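
The waterfall wiring can be sketched as a pure function that turns a stage list into task descriptors. This is an illustrative model only: `planPipeline`, `StageSpec`, and `PlannedTask` are hypothetical names, and falling back to the stage `id` when `label` is missing is an assumption, not documented behavior.

```ts
// Hypothetical planner that mirrors the expansion above. Not Smithers code.
interface StageSpec {
  id: string;
  label?: string;
}

interface PlannedTask {
  id: string;
  prompt: string;
  needs?: { previous: string };
}

function planPipeline(stages: StageSpec[], initialPrompt: string): PlannedTask[] {
  const seen = new Set<string>();
  const tasks: PlannedTask[] = [];
  let previousId: string | undefined;
  for (const stage of stages) {
    if (seen.has(stage.id)) throw new Error(`duplicate stage id: ${stage.id}`);
    seen.add(stage.id);
    if (previousId === undefined) {
      // First stage: receives the initial prompt and has no dependency.
      tasks.push({ id: stage.id, prompt: initialPrompt });
    } else {
      tasks.push({
        id: stage.id,
        // Assumption: fall back to the stage id when no label is given.
        prompt: `Continue from the previous stage's output. Perform: ${stage.label ?? stage.id}`,
        needs: { previous: previousId },
      });
    }
    previousId = stage.id;
  }
  return tasks;
}

const blogPlan = planPipeline(
  [
    { id: "outline", label: "Create outline" },
    { id: "draft", label: "Write draft" },
    { id: "edit", label: "Edit and polish" },
  ],
  "Write a blog post about building AI workflows with React components.",
);
```

Each entry after the first carries a `needs.previous` pointing at the stage before it, which is the same chain the expanded tree expresses.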

## Notes

- Stages execute in array order. Each stage after the first depends on the previous stage via `needs`.
- The first stage receives `children` as its prompt. Subsequent stages receive a continuation prompt that includes the stage label.
- Stage `id` values must be unique within the pipeline and across the workflow.
- Use `label` to provide descriptive names that appear in the TUI and in continuation prompts.

---

## <ApprovalGate>

> Conditional approval that requires human sign-off only when a condition is true, otherwise auto-approves.
> Source: https://smithers.sh/components/approval-gate

Wraps `<Branch>` + `<Approval>` into a single component. When `when` is `true`, the workflow pauses for human approval. When `false`, a static task auto-approves immediately so downstream nodes can proceed without delay.

## Import

```tsx
import { ApprovalGate } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | **(required)** | Unique node id within the workflow. |
| `output` | `z.ZodObject \| Table \| string` | **(required)** | Where to persist the approval decision. |
| `request` | `{ title: string; summary?: string; metadata?: Record<string, unknown> }` | **(required)** | Human-facing approval request. |
| `when` | `boolean` | **(required)** | When `true`, approval is required. When `false`, auto-approves. |
| `onDeny` | `"fail" \| "continue" \| "skip"` | `"fail"` | Behavior after denial. |
| `skipIf` | `boolean` | `false` | Skip this node entirely. |
| `timeoutMs` | `number` | `undefined` | Max wait in ms. Node fails on timeout. |
| `retries` | `number` | `0` | Retry attempts before failure. |
| `retryPolicy` | `RetryPolicy` | `undefined` | `{ backoff?: "fixed" \| "linear" \| "exponential", initialDelayMs?: number }` |
| `continueOnFail` | `boolean` | `false` | Workflow continues even if this node fails. |

## Basic usage

Gate production deploys on a risk score. Low-risk changes sail through; high-risk changes require a human sign-off.

```tsx
const risk = ctx.output(outputs.riskScore, { nodeId: "risk" });

<Workflow name="deploy-pipeline">
  <Sequence>
    <Task id="risk" output={outputs.riskScore} agent={riskAgent}>
      Assess the risk of deploying this changeset.
    </Task>

    <ApprovalGate
      id="deploy-approval"
      output={outputs.deployDecision}
      when={risk.level === "high"}
      request={{
        title: "Approve high-risk deploy?",
        summary: `Risk score: ${risk.score}/100`,
        metadata: { commit: ctx.input.sha },
      }}
      onDeny="fail"
    />

    <Task id="deploy" output={outputs.deploy}>
      {{ deployed: true }}
    </Task>
  </Sequence>
</Workflow>
```

## Auto-approve on dry run

```tsx
<ApprovalGate
  id="publish-approval"
  output={outputs.publishDecision}
  when={!ctx.input.dryRun}
  request={{ title: "Publish the article?" }}
/>
```

When `dryRun` is `true`, `when` is `false` and the gate emits `{ approved: true, note: "auto-approved", decidedBy: null, decidedAt: null }` without pausing.

## With timeout and retry

```tsx
<ApprovalGate
  id="budget-approval"
  output={outputs.budgetDecision}
  when={estimate.total > 10_000}
  request={{
    title: "Approve budget over $10k?",
    summary: `Estimated cost: $${estimate.total}`,
  }}
  timeoutMs={60 * 60 * 1000}
  retries={1}
  onDeny="continue"
/>
```

## How it works

`<ApprovalGate>` renders a `<Branch>`:

- **`when` is `true`** -- mounts an `<Approval>` node that pauses for human review.
- **`when` is `false`** -- mounts a static `<Task>` that resolves immediately with `{ approved: true, note: "auto-approved", decidedBy: null, decidedAt: null }`.

Both paths write to the same `output`, so downstream nodes can branch on `decision.approved` without caring which path was taken.
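
The two paths can be sketched as a function that returns the same decision shape either way. This is a hypothetical model, not the component's implementation: `resolveGate`, `nextStep`, and the `askHuman` callback are illustrative names, and the field types on `ApprovalDecision` are assumed from the auto-approved value shown above.

```ts
// Hypothetical model of the two ApprovalGate paths. Field types are assumed.
interface ApprovalDecision {
  approved: boolean;
  note: string | null;
  decidedBy: string | null;
  decidedAt: string | null;
}

function resolveGate(when: boolean, askHuman: () => ApprovalDecision): ApprovalDecision {
  if (!when) {
    // Auto-approve path: same shape as a human decision, no pause.
    return { approved: true, note: "auto-approved", decidedBy: null, decidedAt: null };
  }
  // Human path: in Smithers this is where the workflow would pause.
  return askHuman();
}

// Downstream code only looks at `approved`, whichever path produced it.
function nextStep(decision: ApprovalDecision): "deploy" | "stop" {
  return decision.approved ? "deploy" : "stop";
}
```

Keeping both paths on one shape is what lets `nextStep` stay free of any check for how the decision was made.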

## Notes

- The auto-approve path produces a valid `ApprovalDecision` shape, so downstream logic remains uniform.
- Auto-approve timing lives in Smithers' internal approval/event records, not in the durable task output.
- `onDeny` only applies to the human-approval path. The auto-approve path always succeeds.
- Combine with `skipIf` to disable the gate entirely during development.

---

## <EscalationChain>

> Sequential agent escalation with optional human fallback when automated levels are exhausted.
> Source: https://smithers.sh/components/escalation-chain

Runs a series of agents in order. If a level fails or its `escalateIf` predicate returns `true`, the next level takes over. Optionally ends with a human approval fallback.

## Import

```tsx
import { EscalationChain } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"escalation"` | ID prefix for generated nodes. |
| `levels` | `EscalationLevel[]` | **(required)** | Ordered escalation levels. |
| `humanFallback` | `boolean` | `false` | Append a human approval node as final escalation. |
| `humanRequest` | `ApprovalRequest` | Auto-generated | Approval request config for the human fallback. |
| `escalationOutput` | `z.ZodObject \| Table \| string` | **(required)** | Output target for escalation-tracking nodes. |
| `skipIf` | `boolean` | `false` | Skip the entire chain. |
| `children` | `ReactNode` | `undefined` | Prompt/input forwarded to each agent level. |

### EscalationLevel

| Field | Type | Description |
| --- | --- | --- |
| `agent` | `AgentLike` | Agent to handle this level. |
| `output` | `z.ZodObject \| Table \| string` | Output target for this level's result. |
| `label` | `string` | Optional display label. |
| `escalateIf` | `(result: any) => boolean` | Predicate on the level's result. Return `true` to escalate. |

## Basic usage

A three-tier support chain: fast model, powerful model, human.

```tsx
<Workflow name="support-ticket">
  <EscalationChain
    id="support"
    escalationOutput={outputs.escalation}
    humanFallback
    humanRequest={{
      title: "Ticket needs human support",
      summary: "Automated agents could not resolve the issue.",
    }}
    levels={[
      {
        agent: fastAgent,
        output: outputs.tier1,
        label: "Tier 1 -- fast model",
        escalateIf: (r) => r.confidence < 0.7,
      },
      {
        agent: powerAgent,
        output: outputs.tier2,
        label: "Tier 2 -- reasoning model",
        escalateIf: (r) => r.confidence < 0.9,
      },
    ]}
  >
    Resolve this customer support ticket: {ctx.input.ticketBody}
  </EscalationChain>
</Workflow>
```

## Without human fallback

```tsx
<EscalationChain
  id="code-review"
  escalationOutput={outputs.reviewEscalation}
  levels={[
    {
      agent: lintAgent,
      output: outputs.lint,
      label: "Lint check",
      escalateIf: (r) => r.issues.length > 0,
    },
    {
      agent: reviewAgent,
      output: outputs.review,
      label: "Deep review",
    },
  ]}
>
  Review the PR diff.
</EscalationChain>
```

## Two-level with custom labels

```tsx
<EscalationChain
  id="triage"
  escalationOutput={outputs.triageEscalation}
  levels={[
    {
      agent: classifierAgent,
      output: outputs.classify,
      label: "Auto-classify",
      escalateIf: (r) => r.category === "unknown",
    },
    {
      agent: seniorAgent,
      output: outputs.seniorClassify,
      label: "Senior classifier",
    },
  ]}
>
  Classify this incoming request.
</EscalationChain>
```

## How it works

`<EscalationChain>` renders a `<Sequence>` containing:

1. **Level 0** -- always runs with `continueOnFail: true`.
2. **Check node** -- a compute Task that records the escalation decision.
3. **Level 1** -- gated by a `<Branch>`, runs only when escalation is triggered.
4. Repeat for each subsequent level.
5. **Human fallback** (optional) -- an `<Approval>` node appended at the end.

Each agent Task sets `continueOnFail`, so a failed level hands off to the next level instead of halting the workflow.
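
The escalation order can be modeled with ordinary functions. This sketch is a simplification, not the Smithers runtime: `runChain`, `attempt`, `Level`, and `Outcome` are hypothetical names, levels run synchronously, and the `"exhausted"` outcome stands in for whatever the chain reports when every level escalates and there is no human fallback.

```ts
// Hypothetical model of escalation. Not the Smithers runtime.
interface Level<R> {
  label: string;
  run: (prompt: string) => R;
  escalateIf?: (result: R) => boolean;
}

type Outcome<R> =
  | { kind: "resolved"; level: number; label: string; result: R }
  | { kind: "human" }
  | { kind: "exhausted" };

type Attempt<R> = { ok: true; result: R } | { ok: false };

// Mirrors continueOnFail: a thrown error becomes a value, not a crash.
function attempt<R>(level: Level<R>, prompt: string): Attempt<R> {
  try {
    return { ok: true, result: level.run(prompt) };
  } catch {
    return { ok: false };
  }
}

function runChain<R>(levels: Level<R>[], prompt: string, humanFallback = false): Outcome<R> {
  for (let i = 0; i < levels.length; i++) {
    const level = levels[i];
    const tried = attempt(level, prompt);
    if (!tried.ok) continue; // a failed level escalates
    if (level.escalateIf?.(tried.result)) continue; // the predicate escalates
    return { kind: "resolved", level: i, label: level.label, result: tried.result };
  }
  return humanFallback ? { kind: "human" } : { kind: "exhausted" };
}

// Usage: tier 1 is not confident enough, tier 2 is.
const supportDemo = runChain<{ confidence: number }>(
  [
    { label: "Tier 1", run: () => ({ confidence: 0.5 }), escalateIf: (r) => r.confidence < 0.7 },
    { label: "Tier 2", run: () => ({ confidence: 0.95 }), escalateIf: (r) => r.confidence < 0.9 },
  ],
  "Resolve this ticket",
  true,
);
```

In this run `supportDemo` resolves at level index `1`; the human fallback is reached only when every level fails or escalates.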

## Notes

- The `children` prop (prompt text) is forwarded to every agent level, so each agent receives the same input context.
- `escalateIf` is evaluated at task execution time, not at render time.
- If no level resolves successfully and `humanFallback` is `false`, the chain completes with the last level's failure.
- Combine with `skipIf` to bypass the chain entirely.

---

## <DecisionTable>

> Structured deterministic routing that replaces nested Branches with a flat, declarative rule table.
> Source: https://smithers.sh/components/decision-table

Maps a list of `{ when, then }` rules to workflow elements. Replaces deeply nested `<Branch>` trees with a readable, policy-like table.

## Import

```tsx
import { DecisionTable } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `undefined` | ID prefix for wrapper nodes. |
| `rules` | `DecisionRule[]` | **(required)** | Ordered list of rules. |
| `default` | `ReactElement` | `undefined` | Fallback element when no rules match. |
| `strategy` | `"first-match" \| "all-match"` | `"first-match"` | `"first-match"`: first matching rule wins. `"all-match"`: all matching rules run in parallel. |
| `skipIf` | `boolean` | `false` | Skip the entire table. |

### DecisionRule

| Field | Type | Description |
| --- | --- | --- |
| `when` | `boolean` | Condition evaluated at render time. |
| `then` | `ReactElement` | Element to render when this rule matches. |
| `label` | `string` | Optional display label for the rule. |

## Basic usage -- first-match

Route a support ticket by severity. First matching rule wins.

```tsx
const triage = ctx.output(outputs.triage, { nodeId: "triage" });

<Workflow name="ticket-router">
  <Sequence>
    <Task id="triage" output={outputs.triage} agent={triageAgent}>
      Classify this ticket by severity.
    </Task>

    <DecisionTable
      rules={[
        {
          when: triage.severity === "critical",
          label: "Critical path",
          then: (
            <Task id="page-oncall" output={outputs.page} agent={pagerAgent}>
              Page the on-call engineer immediately.
            </Task>
          ),
        },
        {
          when: triage.severity === "high",
          label: "High priority",
          then: (
            <Task id="assign-senior" output={outputs.assign}>
              {{ assignee: "senior-pool", priority: "high" }}
            </Task>
          ),
        },
        {
          when: triage.severity === "low",
          label: "Low priority",
          then: (
            <Task id="add-backlog" output={outputs.backlog}>
              {{ queued: true }}
            </Task>
          ),
        },
      ]}
      default={
        <Task id="default-assign" output={outputs.assign}>
          {{ assignee: "general-pool", priority: "medium" }}
        </Task>
      }
    />
  </Sequence>
</Workflow>
```

## All-match strategy

Run all applicable compliance checks in parallel.

```tsx
<DecisionTable
  id="compliance"
  strategy="all-match"
  rules={[
    {
      when: ctx.input.region === "EU",
      label: "GDPR check",
      then: (
        <Task id="gdpr" output={outputs.gdpr} agent={gdprAgent}>
          Verify GDPR compliance.
        </Task>
      ),
    },
    {
      when: ctx.input.hasPII,
      label: "PII scan",
      then: (
        <Task id="pii-scan" output={outputs.pii} agent={piiAgent}>
          Scan for unprotected PII.
        </Task>
      ),
    },
    {
      when: ctx.input.amount > 50_000,
      label: "Financial audit",
      then: (
        <Task id="audit" output={outputs.audit} agent={auditAgent}>
          Run financial audit checks.
        </Task>
      ),
    },
  ]}
  default={
    <Task id="no-checks" output={outputs.noChecks}>
      {{ passed: true, note: "No compliance checks required" }}
    </Task>
  }
/>
```

## With skipIf

```tsx
<DecisionTable
  skipIf={ctx.input.dryRun}
  rules={[
    {
      when: testsPass,
      then: <Task id="deploy" output={outputs.deploy}>{{ ok: true }}</Task>,
    },
  ]}
  default={
    <Task id="skip-deploy" output={outputs.deploy}>{{ ok: false }}</Task>
  }
/>
```

## How it works

**`"first-match"`** builds nested `<Branch>` elements from the rules array. The first rule whose `when` is `true` renders its `then` element. If no rules match, the `default` element renders (or `null` if no default).

```
Branch(rule[0].when)
  then: rule[0].then
  else: Branch(rule[1].when)
    then: rule[1].then
    else: Branch(rule[2].when)
      then: rule[2].then
      else: default
```

**`"all-match"`** collects every rule where `when` is `true` and wraps their `then` elements in a `<Parallel>`. If no rules match, the `default` renders.
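
Both strategies reduce to a filter over the rules. The sketch below models the selection step only, with strings standing in for the `then` elements; `selectBranches`, `Rule`, and `Strategy` are hypothetical names, not part of the Smithers API.

```ts
// Hypothetical model of rule selection. Strings stand in for JSX elements.
interface Rule<T> {
  when: boolean;
  then: T;
  label?: string;
}

type Strategy = "first-match" | "all-match";

function selectBranches<T>(
  rules: Rule<T>[],
  strategy: Strategy = "first-match",
  fallback?: T,
): T[] {
  const matched = rules.filter((rule) => rule.when).map((rule) => rule.then);
  // No match: render the default if there is one, otherwise nothing.
  if (matched.length === 0) return fallback === undefined ? [] : [fallback];
  // first-match keeps only the earliest matching rule; all-match keeps them all.
  return strategy === "first-match" ? matched.slice(0, 1) : matched;
}

const complianceRules: Rule<string>[] = [
  { when: true, then: "gdpr", label: "GDPR check" },
  { when: false, then: "pii-scan", label: "PII scan" },
  { when: true, then: "audit", label: "Financial audit" },
];
```

With these rules, `"first-match"` selects only `gdpr`, while `"all-match"` selects `gdpr` and `audit`, which the real component would wrap in a `<Parallel>`.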

## Notes

- All `when` conditions are evaluated at render time, just like `<Branch>`, and re-evaluated on each render frame, which enables data-dependent routing.
- For `"first-match"`, rule order matters -- put higher-priority rules first.
- For `"all-match"`, all matching rules execute concurrently with no ordering guarantee.
- Each `then` element accepts any workflow element: a single `<Task>`, a `<Sequence>`, a `<Parallel>`, or another composite.

---

## <DriftDetector>

> Composite component that captures state, compares it to a baseline, and conditionally alerts when meaningful drift is detected.
> Source: https://smithers.sh/components/drift-detector

```tsx
import { DriftDetector } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"drift"` | ID prefix for generated task ids (`{id}-capture`, `{id}-compare`). |
| `captureAgent` | `AgentLike` | **(required)** | Agent that captures the current state snapshot. |
| `compareAgent` | `AgentLike` | **(required)** | Agent that compares current state against the baseline. |
| `captureOutput` | `OutputTarget` | **(required)** | Output schema for the captured state. |
| `compareOutput` | `OutputTarget` | **(required)** | Output schema for comparison. Should include `drifted: boolean` and `significance: string`. |
| `baseline` | `unknown` | **(required)** | Static baseline data (object, string, etc.) to compare against. |
| `alertIf` | `(comparison) => boolean` | `undefined` | Custom condition for firing the alert. If omitted, the `drifted` field from the comparison output is used. |
| `alert` | `ReactElement` | `undefined` | Element to render when drift is detected (e.g. a `<Task>` that sends a notification). |
| `poll` | `{ intervalMs: number, maxPolls?: number }` | `undefined` | If set, wraps the detector in a `<Loop>` for periodic polling. `maxPolls` defaults to `100` when `poll` is provided but `maxPolls` is omitted. |
| `skipIf` | `boolean` | `false` | Skip the entire component. Returns `null`. |

## What it builds

`<DriftDetector>` composes primitives into the following tree:

```
Sequence
  ├─ Task (capture current state)
  ├─ Task (compare against baseline)
  └─ Branch (if drifted → alert element)
```

When `poll` is provided, the entire `Sequence` is wrapped in a `Loop`.

## Basic usage

```tsx
import { DriftDetector, Task, Workflow } from "smithers-orchestrator";
import { z } from "zod";

const captureSchema = z.object({
  endpoints: z.array(z.string()),
  schemaHash: z.string(),
});

const compareSchema = z.object({
  drifted: z.boolean(),
  significance: z.string(),
  changes: z.array(z.string()),
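});

// `outputs` is assumed to come from `createSmithers`, with these schemas registered
// under `capture` and `compare` (plus one for `notify`). The agents are defined elsewhere.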

<Workflow name="api-drift-check">
  <DriftDetector
    captureAgent={snapshotAgent}
    compareAgent={diffAgent}
    captureOutput={outputs.capture}
    compareOutput={outputs.compare}
    baseline={{ endpoints: ["/users", "/orders"], schemaHash: "abc123" }}
    alert={
      <Task id="notify" output={outputs.notify} agent={slackAgent}>
        API drift detected — notify the team.
      </Task>
    }
  />
</Workflow>
```

## Poll mode

Poll periodically to detect drift over time:

```tsx
<DriftDetector
  id="config-drift"
  captureAgent={configReader}
  compareAgent={configDiffer}
  captureOutput={outputs.configSnapshot}
  compareOutput={outputs.configDiff}
  baseline={knownGoodConfig}
  poll={{ intervalMs: 60_000, maxPolls: 24 }}
  alert={
    <Task id="alert" output={outputs.alert} agent={pagerAgent}>
      Configuration drift detected — page on-call.
    </Task>
  }
/>
```

This runs every 60 seconds, up to 24 times.
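
As a sizing check, the polling window can be estimated from the two `poll` fields. The helper below is a hypothetical sketch: `maxWaitMs` and `PollConfig` are illustrative names, and it assumes one full interval per poll while ignoring the time the capture and compare tasks themselves take, so treat the result as a rough bound.

```ts
interface PollConfig {
  intervalMs: number;
  maxPolls?: number;
}

const DEFAULT_MAX_POLLS = 100; // documented default when `maxPolls` is omitted

// Rough upper bound on time spent waiting, assuming one interval per poll.
function maxWaitMs(poll: PollConfig): number {
  return poll.intervalMs * (poll.maxPolls ?? DEFAULT_MAX_POLLS);
}

// 60_000 ms x 24 polls = 1_440_000 ms, i.e. about 24 minutes.
const configDriftWindowMinutes = maxWaitMs({ intervalMs: 60_000, maxPolls: 24 }) / 60_000;
```

Leaving `maxPolls` out of the same config would stretch the estimate to about 100 minutes, which is worth checking before relying on the default.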

## Custom alert condition

Use `alertIf` to override the default `drifted` check:

```tsx
<DriftDetector
  captureAgent={snapshotAgent}
  compareAgent={diffAgent}
  captureOutput={outputs.capture}
  compareOutput={outputs.compare}
  baseline={previousRelease}
  alertIf={(comparison) => comparison.significance === "breaking"}
  alert={
    <Task id="block-deploy" output={outputs.block}>
      Block the deployment — breaking changes detected.
    </Task>
  }
/>
```
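
The choice between the default check and `alertIf` can be sketched as a single function. This models the documented rule only; `shouldAlert` and the `Comparison` type are hypothetical names, with the fields taken from the example compare schema.

```ts
// Assumed shape, matching the example compare schema.
interface Comparison {
  drifted: boolean;
  significance: string;
  changes: string[];
}

// Use the custom predicate when given, otherwise fall back to `drifted`.
function shouldAlert(
  comparison: Comparison,
  alertIf?: (comparison: Comparison) => boolean,
): boolean {
  return alertIf ? alertIf(comparison) : comparison.drifted;
}

const minorDrift: Comparison = {
  drifted: true,
  significance: "minor",
  changes: ["added /health endpoint"],
};
```

With `minorDrift`, the default rule alerts because `drifted` is `true`, while a predicate that requires `significance === "breaking"` suppresses the alert.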

## Generated task ids

With the default `id` prefix of `"drift"`:

| Node | ID |
| --- | --- |
| Capture | `drift-capture` |
| Compare | `drift-compare` |
| Poll loop | `drift-poll` |

Override with the `id` prop:

```tsx
<DriftDetector id="schema" ... />
// → schema-capture, schema-compare, schema-poll
```

## Notes

- `<DriftDetector>` is a composite component. It renders a tree of `<Sequence>`, `<Task>`, `<Branch>`, and optionally `<Loop>`.
- The `compareOutput` schema should include `drifted: boolean` so the default alert condition works. If you use `alertIf`, any schema shape is fine.
- Without `alert`, the component captures and compares but takes no action on drift.
- Without `poll`, the component runs once. Use `poll` for continuous monitoring.

---

## <ScanFixVerify>

> Composite component that scans for problems, fixes them in parallel, verifies the fixes, and produces a report.
> Source: https://smithers.sh/components/scan-fix-verify

```tsx
import { ScanFixVerify } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"sfv"` | ID prefix for generated task ids. |
| `scanner` | `AgentLike` | **(required)** | Agent that scans for problems and returns an issues array. |
| `fixer` | `AgentLike \| AgentLike[]` | **(required)** | Agent(s) that fix discovered problems. When an array, agents are cycled across issues. |
| `verifier` | `AgentLike` | **(required)** | Agent that verifies fixes were applied correctly. |
| `scanOutput` | `OutputTarget` | **(required)** | Output schema for scan results. Should include `issues: Array`. |
| `fixOutput` | `OutputTarget` | **(required)** | Output schema for each fix result. |
| `verifyOutput` | `OutputTarget` | **(required)** | Output schema for verification. |
| `reportOutput` | `OutputTarget` | **(required)** | Output schema for the final summary report. |
| `maxConcurrency` | `number` | `Infinity` | Maximum parallel fix tasks. |
| `maxRetries` | `number` | `3` | Maximum scan-fix-verify cycles before stopping. |
| `skipIf` | `boolean` | `false` | Skip the entire component. Returns `null`. |
| `children` | `ReactNode` | `undefined` | Prompt/context describing what to scan for. Passed to the scanner task. |

## What it builds

`<ScanFixVerify>` composes primitives into the following tree:

```
Sequence
  ├─ Loop (up to maxRetries)
  │   └─ Sequence
  │       ├─ Task (scan for problems)
  │       ├─ Parallel (fix all issues)
  │       │   └─ Task (fix)
  │       └─ Task (verify fixes)
  └─ Task (final report)
```

The loop continues until the verifier confirms all issues are resolved or `maxRetries` is reached.
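
The cycle can be modeled as a loop over three callbacks. This is a simplified, hypothetical sketch rather than the component's implementation: `runScanFixVerify` and `CycleReport` are illustrative names, fixes run one after another instead of in parallel, and the verifier is reduced to a boolean.

```ts
// Hypothetical model of the scan-fix-verify cycle. Not the Smithers runtime.
interface CycleReport {
  totalCycles: number;
  issuesFound: number;
  resolved: boolean;
}

function runScanFixVerify<I>(
  scan: () => I[],
  fix: (issue: I) => void,
  verify: () => boolean,
  maxRetries = 3,
): CycleReport {
  let totalCycles = 0;
  let issuesFound = 0;
  let resolved = false;
  while (!resolved && totalCycles < maxRetries) {
    totalCycles += 1;
    const issues = scan();
    issuesFound += issues.length;
    for (const issue of issues) fix(issue); // the real component fixes in parallel
    resolved = verify();
  }
  // The report is produced even if the loop ran out of cycles.
  return { totalCycles, issuesFound, resolved };
}

// Usage: issue "c" needs two attempts before its fix sticks.
const openIssues = new Set(["a", "b", "c"]);
const fixAttempts = new Map<string, number>();
const lintReport = runScanFixVerify<string>(
  () => Array.from(openIssues),
  (issue) => {
    const count = (fixAttempts.get(issue) ?? 0) + 1;
    fixAttempts.set(issue, count);
    if (issue !== "c" || count >= 2) openIssues.delete(issue);
  },
  () => openIssues.size === 0,
  5,
);
```

The first cycle finds three issues and leaves one open; the second finds and fixes the remaining one, so the model reports two cycles and four issues found in total.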

## Basic usage

```tsx
import { ScanFixVerify, Workflow } from "smithers-orchestrator";
import { z } from "zod";

const scanSchema = z.object({
  issues: z.array(z.object({
    file: z.string(),
    line: z.number(),
    message: z.string(),
  })),
});

const fixSchema = z.object({
  file: z.string(),
  applied: z.boolean(),
  description: z.string(),
});

const verifySchema = z.object({
  allResolved: z.boolean(),
  remainingIssues: z.number(),
});

const reportSchema = z.object({
  totalCycles: z.number(),
  issuesFound: z.number(),
  issuesFixed: z.number(),
  summary: z.string(),
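});

// `outputs` is assumed to come from `createSmithers`, with these schemas registered
// under `scan`, `fix`, `verify`, and `report`. The three agents are defined elsewhere.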

<Workflow name="lint-fix">
  <ScanFixVerify
    scanner={lintAgent}
    fixer={fixerAgent}
    verifier={verifyAgent}
    scanOutput={outputs.scan}
    fixOutput={outputs.fix}
    verifyOutput={outputs.verify}
    reportOutput={outputs.report}
    maxRetries={5}
    maxConcurrency={3}
  >
    Scan the codebase for linting errors and type issues.
  </ScanFixVerify>
</Workflow>
```

## Multiple fixer agents

Pass an array of agents to cycle different specialists across issues:

```tsx
<ScanFixVerify
  scanner={securityScanner}
  fixer={[dependencyFixer, codeFixer, configFixer]}
  verifier={securityVerifier}
  scanOutput={outputs.scan}
  fixOutput={outputs.fix}
  verifyOutput={outputs.verify}
  reportOutput={outputs.report}
>
  Scan for security vulnerabilities in dependencies, code, and configuration.
</ScanFixVerify>
```
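
One plausible reading of "cycled across issues" is round-robin assignment by issue index. The sketch below illustrates that reading only; `pickFixer` and `AgentRef` are hypothetical names, and the actual assignment order inside `<ScanFixVerify>` may differ.

```ts
type AgentRef = { name: string };

// Hypothetical helper: choose the fixer for the issue at `issueIndex`
// by cycling through the available fixers.
function pickFixer(fixer: AgentRef | AgentRef[], issueIndex: number): AgentRef {
  const pool = Array.isArray(fixer) ? fixer : [fixer];
  if (pool.length === 0) {
    throw new Error("at least one fixer is required");
  }
  return pool[issueIndex % pool.length];
}

const specialistPool: AgentRef[] = [
  { name: "dependencyFixer" },
  { name: "codeFixer" },
  { name: "configFixer" },
];

// Five issues are spread across three fixers; the fourth issue wraps around.
const fixerAssignment = [0, 1, 2, 3, 4].map((i) => pickFixer(specialistPool, i).name);
```

With a single agent instead of an array, every issue goes to that agent.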

## Generated task ids

With the default `id` prefix of `"sfv"`:

| Task | ID |
| --- | --- |
| Scan | `sfv-scan` |
| Fix | `sfv-fix` |
| Fix parallel group | `sfv-fixes` |
| Verify | `sfv-verify` |
| Loop | `sfv-loop` |
| Report | `sfv-report` |

Override with the `id` prop:

```tsx
<ScanFixVerify id="security" ... />
// → security-scan, security-fix, security-verify, etc.
```

## Notes

- `<ScanFixVerify>` is a composite component. It renders a tree of `<Loop>`, `<Sequence>`, `<Parallel>`, and `<Task>`.
- The `scanOutput` schema should include an `issues` array. The fixer receives context about what to fix from the scan results.
- The loop's `until` condition is driven by the verifier output. When the verifier reports all clear, the loop exits.
- The final report task runs after the loop completes, summarizing all cycles.
- If `maxRetries` is reached without full resolution, the report still runs with the last known state.

---

## <Poller>

> Composite component that polls an external condition with configurable backoff until satisfied or timed out.
> Source: https://smithers.sh/components/poller

```tsx
import { Poller } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"poll"` | ID prefix for generated task ids. |
| `check` | `AgentLike \| Function` | **(required)** | Agent or compute function that checks the condition. |
| `checkOutput` | `OutputTarget` | **(required)** | Output schema for the check result. Must include `satisfied: boolean`. |
| `maxAttempts` | `number` | `30` | Maximum poll attempts before stopping. |
| `backoff` | `"fixed" \| "linear" \| "exponential"` | `"fixed"` | Backoff strategy between polls. |
| `intervalMs` | `number` | `5000` | Base interval in milliseconds between polls. |
| `onTimeout` | `"fail" \| "return-last"` | `"fail"` | Behavior when `maxAttempts` is exhausted. `"fail"` fails the workflow; `"return-last"` keeps the final result and continues. |
| `skipIf` | `boolean` | `false` | Skip the entire component. Returns `null`. |
| `children` | `ReactNode` | `undefined` | Prompt/condition description for the check agent. |

## What it builds

`<Poller>` composes primitives into the following tree:

```
Loop (until satisfied, maxIterations = maxAttempts)
  └─ Task (check condition, timeoutMs based on backoff)
```

Each iteration runs the check task. The loop exits when `satisfied` is `true` or `maxAttempts` is reached.

## Backoff strategies

| Strategy | Interval at attempt N (zero-based) | Example (base 5s) |
| --- | --- | --- |
| `"fixed"` | `intervalMs` | 5s, 5s, 5s, 5s |
| `"linear"` | `intervalMs * (N + 1)` | 5s, 10s, 15s, 20s |
| `"exponential"` | `intervalMs * 2^N` | 5s, 10s, 20s, 40s |
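
The formulas in the table can be checked with a few lines of TypeScript. This is an illustrative sketch, with `pollIntervalMs` as a hypothetical helper name and `attempt` counted from zero so that the output matches the example column.

```ts
type Backoff = "fixed" | "linear" | "exponential";

// Interval before the next poll; `attempt` is zero-based.
function pollIntervalMs(backoff: Backoff, intervalMs: number, attempt: number): number {
  switch (backoff) {
    case "fixed":
      return intervalMs;
    case "linear":
      return intervalMs * (attempt + 1);
    case "exponential":
      return intervalMs * 2 ** attempt;
  }
}

// First four intervals in seconds for a 5s base interval.
const firstFourSeconds = (backoff: Backoff): number[] =>
  [0, 1, 2, 3].map((attempt) => pollIntervalMs(backoff, 5000, attempt) / 1000);
```

Calling `firstFourSeconds("exponential")` reproduces the last row of the table.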

## Basic usage

```tsx
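import { Poller, Workflow, createSmithers } from "smithers-orchestrator";
import { z } from "zod";

const checkSchema = z.object({
  satisfied: z.boolean(),
  status: z.string(),
  details: z.string(),
});

// Register the schema so it can be referenced as outputs.check.
const { smithers, outputs } = createSmithers({
  check: checkSchema,
});

// statusChecker is an agent defined elsewhere.
export default smithers(() => (
  <Workflow name="wait-for-deploy">
    <Poller
      check={statusChecker}
      checkOutput={outputs.check}
      maxAttempts={20}
      intervalMs={10_000}
      backoff="exponential"
      onTimeout="fail"
    >
      Check whether the deployment to production has completed successfully.
    </Poller>
  </Workflow>
));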
```

## With a compute function

Use a plain function instead of an agent for simple HTTP checks:

```tsx
<Poller
  check={async () => {
    const res = await fetch("https://api.example.com/health");
    const data = await res.json();
    return { satisfied: data.status === "healthy", status: data.status };
  }}
  checkOutput={outputs.healthCheck}
  maxAttempts={10}
  intervalMs={3000}
  backoff="linear"
  onTimeout="return-last"
>
  Wait for the API health check to return healthy.
</Poller>
```

## Timeout behavior

| `onTimeout` | What happens at `maxAttempts` |
| --- | --- |
| `"fail"` | Workflow fails with a max-iteration error. |
| `"return-last"` | Keeps the last check result and continues the workflow. |
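
The two modes can be modelled with a small synchronous loop. This is a simplified sketch, not the component's code: `pollUntilSatisfied` is a hypothetical name, the check is synchronous, and the waits between attempts are left out.

```ts
type CheckResult = { satisfied: boolean; status: string };
type OnTimeout = "fail" | "return-last";

// Hypothetical model of the poll loop and its behavior at maxAttempts.
function pollUntilSatisfied(
  check: (attempt: number) => CheckResult,
  maxAttempts: number,
  onTimeout: OnTimeout,
): CheckResult {
  if (maxAttempts < 1) {
    throw new Error("maxAttempts must be at least 1");
  }
  let last = check(0);
  for (let attempt = 1; !last.satisfied && attempt < maxAttempts; attempt++) {
    last = check(attempt);
  }
  if (!last.satisfied && onTimeout === "fail") {
    throw new Error(`condition not satisfied after ${maxAttempts} attempts`);
  }
  // Either the condition was met, or "return-last" keeps the final result.
  return last;
}

// The service becomes healthy on the third check (attempt index 2).
const healthOutcome = pollUntilSatisfied(
  (attempt) => ({ satisfied: attempt >= 2, status: attempt >= 2 ? "healthy" : "starting" }),
  10,
  "fail",
);
```

With `"return-last"`, downstream steps should inspect `satisfied` themselves, because the returned result may still be unsatisfied.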

## Generated task ids

With the default `id` prefix of `"poll"`:

| Task | ID |
| --- | --- |
| Check | `poll-check` |
| Loop | `poll-loop` |

Override with the `id` prop:

```tsx
<Poller id="deploy-status" ... />
// → deploy-status-check, deploy-status-loop
```

## Notes

- `<Poller>` is a composite component. It renders a `<Loop>` containing a single `<Task>`.
- The `checkOutput` schema must include a `satisfied: boolean` field. The loop's `until` condition checks this field.
- Use `<Poller>` as a pull-based fallback when webhooks or push notifications are not available.
- The `timeoutMs` on the check task is derived from the backoff strategy, giving each poll attempt time proportional to the interval.
- For simple cases, the `gate.tsx` example shows a similar pattern built manually with `<Loop>` and `<Task>`.

---

## <Supervisor>

> Composite component that orchestrates a boss agent planning, delegating to parallel workers, reviewing results, and re-delegating failures.
> Source: https://smithers.sh/components/supervisor

A higher-level component that composes `Sequence`, `Task`, `Parallel`, `Loop`, and optionally `Worktree` into a full supervisor workflow. The boss agent plans work, workers execute in parallel, the boss reviews, and failures are re-delegated -- all in a single declarative element.

## Import

```tsx
import { Supervisor } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"supervisor"` | ID prefix for all generated nodes. |
| `boss` | `AgentLike` | **(required)** | Agent that plans, delegates, and reviews. |
| `workers` | `Record<string, AgentLike>` | **(required)** | Map of worker type names to agents (e.g., `{ coder, tester, docs }`). |
| `planOutput` | `OutputTarget` | **(required)** | Output schema for the boss's plan. Should include `tasks: Array<{ id, workerType, instructions }>`. |
| `workerOutput` | `OutputTarget` | **(required)** | Output schema for individual worker results. |
| `reviewOutput` | `OutputTarget` | **(required)** | Output schema for the boss's review. Should include `allDone: boolean` and `retriable: string[]`. |
| `finalOutput` | `OutputTarget` | **(required)** | Output schema for the final summary. |
| `maxIterations` | `number` | `3` | Max delegate-review cycles before stopping. |
| `maxConcurrency` | `number` | `5` | Max parallel workers per cycle. |
| `useWorktrees` | `boolean` | `false` | Whether each worker gets its own git worktree for isolation. |
| `skipIf` | `boolean` | `false` | Skip the entire supervisor workflow. |
| `children` | `string \| ReactNode` | **(required)** | Goal/prompt passed to the boss agent for planning. |

## Generated structure

`<Supervisor>` expands to:

```
Sequence
  ├── Task (boss plan)          id: "{prefix}-plan"
  ├── Loop (until allDone)      id: "{prefix}-loop"
  │   └── Sequence
  │       ├── Parallel
  │       │   ├── [Worktree?] → Task (worker A)   id: "{prefix}-worker-{type}"
  │       │   ├── [Worktree?] → Task (worker B)
  │       │   └── ...
  │       └── Task (boss review)                   id: "{prefix}-review"
  └── Task (final summary)     id: "{prefix}-final"
```

## Basic usage

```tsx
import { Supervisor, createSmithers } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const planSchema = z.object({
  tasks: z.array(z.object({
    id: z.string(),
    workerType: z.enum(["coder", "tester"]),
    instructions: z.string(),
  })),
  strategy: z.string(),
});

const workerResultSchema = z.object({
  taskId: z.string(),
  status: z.enum(["success", "partial", "failed"]),
  summary: z.string(),
});

const reviewSchema = z.object({
  allDone: z.boolean(),
  retriable: z.array(z.string()),
  summary: z.string(),
});

const finalSchema = z.object({
  totalTasks: z.number(),
  succeeded: z.number(),
  summary: z.string(),
});

const { Workflow, smithers, outputs } = createSmithers({
  plan: planSchema,
  workerResult: workerResultSchema,
  review: reviewSchema,
  final: finalSchema,
});

const boss = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a tech lead. Break goals into tasks and assign them.",
});

const coder = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a developer. Implement assigned tasks.",
});

const tester = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a test engineer. Write tests for assigned code.",
});

export default smithers(() => (
  <Workflow name="build-feature">
    <Supervisor
      boss={boss}
      workers={{ coder, tester }}
      planOutput={outputs.plan}
      workerOutput={outputs.workerResult}
      reviewOutput={outputs.review}
      finalOutput={outputs.final}
      maxIterations={3}
      maxConcurrency={4}
    >
      Build the user authentication module with tests.
    </Supervisor>
  </Workflow>
));
```

## With worktrees

Enable `useWorktrees` so each worker operates in an isolated git worktree:

```tsx
<Supervisor
  id="isolated-build"
  boss={boss}
  workers={{ coder, tester, docs: docsWriter }}
  planOutput={outputs.plan}
  workerOutput={outputs.workerResult}
  reviewOutput={outputs.review}
  finalOutput={outputs.final}
  useWorktrees
  maxConcurrency={3}
>
  Refactor the payments module and update docs.
</Supervisor>
```

Each worker Task is wrapped in a `<Worktree>` at `.worktrees/{prefix}-worker-{type}` with branch `worker/{prefix}-worker-{type}`.

## Node IDs

All generated node IDs are prefixed with the `id` prop (default `"supervisor"`):

| Node | ID |
| --- | --- |
| Plan | `{id}-plan` |
| Loop | `{id}-loop` |
| Worker (per type) | `{id}-worker-{workerType}` |
| Review | `{id}-review` |
| Final | `{id}-final` |

Use these IDs with `ctx.outputMaybe()` or `needs` to reference supervisor outputs from other parts of your workflow.
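
When referencing these nodes from elsewhere, it can help to derive the IDs in one place. The helper below is a hypothetical convenience that mirrors the table and the worktree naming described earlier; it is not exported by `smithers-orchestrator`.

```ts
// Hypothetical helper: derive generated node IDs from the `id` prefix.
function supervisorNodeIds(workerTypes: string[], id = "supervisor") {
  return {
    plan: `${id}-plan`,
    loop: `${id}-loop`,
    review: `${id}-review`,
    final: `${id}-final`,
    workers: workerTypes.map((workerType) => {
      const nodeId = `${id}-worker-${workerType}`;
      return {
        nodeId,
        // Only relevant when `useWorktrees` is enabled.
        worktreePath: `.worktrees/${nodeId}`,
        branch: `worker/${nodeId}`,
      };
    }),
  };
}

const isolatedBuildIds = supervisorNodeIds(["coder", "tester", "docs"], "isolated-build");
```

For the worktree example above, the first worker resolves to the node `isolated-build-worker-coder` on branch `worker/isolated-build-worker-coder`.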

## Notes

- The `until` condition on the Loop is evaluated reactively by the runtime. The boss review output's `allDone` field controls termination.
- Workers run with `continueOnFail` so a single worker failure does not abort the cycle.
- The final summary Task depends on both the plan and review outputs via `needs`.
- Combine with `<Sequence>` to chain a Supervisor with other workflow steps.

---

## <Runbook>

> Composite component that executes sequential steps with risk classification, auto-executing safe steps and gating risky ones with approval.
> Source: https://smithers.sh/components/runbook

A higher-level component that composes `Sequence`, `Task`, and `Approval` into a runbook workflow. Safe steps auto-execute. Risky and critical steps pause for human approval before proceeding.

## Import

```tsx
import { Runbook } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"runbook"` | ID prefix for all generated nodes. |
| `steps` | `RunbookStep[]` | **(required)** | Ordered steps to execute. See step shape below. |
| `defaultAgent` | `AgentLike` | `undefined` | Default agent for steps that don't specify one. |
| `stepOutput` | `OutputTarget` | **(required)** | Default output schema for step results. |
| `approvalRequest` | `Partial<ApprovalRequest>` | `undefined` | Template for approval requests on risky/critical steps. |
| `onDeny` | `"fail" \| "skip"` | `"fail"` | Behavior when a risky/critical step is denied. |
| `skipIf` | `boolean` | `false` | Skip the entire runbook. |

### RunbookStep

```ts
type RunbookStep = {
  id: string;                              // Unique step identifier
  agent?: AgentLike;                       // Per-step agent override
  command?: string;                        // Shell command or instruction
  risk: "safe" | "risky" | "critical";     // Risk classification
  label?: string;                          // Human-readable label
  output?: OutputTarget;                   // Per-step output override
};
```

## Generated structure

Each step expands differently based on risk:

```
Sequence
  ├── Task (safe step)              id: "{prefix}-{step.id}"
  ├── Approval (risky gate)         id: "{prefix}-{step.id}-approval"
  ├── Task (risky step)             id: "{prefix}-{step.id}"
  ├── Approval (critical gate)      id: "{prefix}-{step.id}-approval"    meta: { elevated: true }
  └── Task (critical step)          id: "{prefix}-{step.id}"
```

Steps are chained sequentially: each step's `needs` references the previous step so execution order is guaranteed.
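
To make the expansion concrete, the sketch below lists the nodes a runbook would produce for a given set of steps. `expandRunbook`, `StepSpec`, and `PlannedNode` are hypothetical names used for illustration; the real component renders `<Task>` and `<Approval>` elements rather than plain objects.

```ts
type Risk = "safe" | "risky" | "critical";
type StepSpec = { id: string; risk: Risk };
type PlannedNode = { kind: "task" | "approval"; id: string; elevated?: boolean };

// Hypothetical helper: nodes in execution order for the given steps.
function expandRunbook(steps: StepSpec[], prefix = "runbook"): PlannedNode[] {
  const nodes: PlannedNode[] = [];
  for (const step of steps) {
    const taskId = `${prefix}-${step.id}`;
    if (step.risk !== "safe") {
      // Risky and critical steps get an approval gate before the task.
      const gate: PlannedNode = { kind: "approval", id: `${taskId}-approval` };
      if (step.risk === "critical") {
        gate.elevated = true;
      }
      nodes.push(gate);
    }
    nodes.push({ kind: "task", id: taskId });
  }
  return nodes;
}

const deployPlanNodes = expandRunbook([
  { id: "health-check", risk: "safe" },
  { id: "backup-db", risk: "risky" },
  { id: "run-migration", risk: "critical" },
]);
```

Three steps yield five nodes here: one task for the safe step, and an approval plus a task for each gated step.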

## Basic usage

```tsx
import { Runbook, createSmithers } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const stepResultSchema = z.object({
  stepId: z.string(),
  success: z.boolean(),
  output: z.string(),
});

const { Workflow, smithers, outputs } = createSmithers({
  stepResult: stepResultSchema,
});

const ops = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are an ops engineer. Execute runbook steps carefully.",
});

export default smithers(() => (
  <Workflow name="deploy-runbook">
    <Runbook
      defaultAgent={ops}
      stepOutput={outputs.stepResult}
      steps={[
        { id: "health-check", command: "curl -f https://api.example.com/health", risk: "safe", label: "Health check" },
        { id: "backup-db", command: "pg_dump prod > backup.sql", risk: "risky", label: "Backup database" },
        { id: "run-migration", command: "npx prisma migrate deploy", risk: "critical", label: "Run migration" },
        { id: "smoke-test", command: "npm run test:smoke", risk: "safe", label: "Smoke tests" },
      ]}
    />
  </Workflow>
));
```

In this example:
- **health-check** and **smoke-test** auto-execute (safe).
- **backup-db** pauses for approval (risky).
- **run-migration** pauses for elevated approval (critical, with `elevated: true` metadata).

## Approval customization

Override the default approval prompt with `approvalRequest`:

```tsx
<Runbook
  defaultAgent={ops}
  stepOutput={outputs.stepResult}
  approvalRequest={{
    title: "Production change requires approval",
    summary: "An operator must approve before this step runs.",
    metadata: { team: "platform", environment: "production" },
  }}
  steps={[
    { id: "scale-down", command: "kubectl scale deployment/api --replicas=0", risk: "risky" },
    { id: "deploy", command: "kubectl apply -f deploy.yaml", risk: "critical" },
    { id: "scale-up", command: "kubectl scale deployment/api --replicas=3", risk: "safe" },
  ]}
/>
```

## Deny behavior

Control what happens when approval is denied:

| `onDeny` | Behavior |
| --- | --- |
| `"fail"` | Workflow fails immediately (default). |
| `"skip"` | The denied step is skipped; subsequent steps continue. |

```tsx
<Runbook
  defaultAgent={ops}
  stepOutput={outputs.stepResult}
  onDeny="skip"
  steps={[
    { id: "optional-cleanup", command: "rm -rf /tmp/cache", risk: "risky" },
    { id: "verify", command: "npm test", risk: "safe" },
  ]}
/>
```
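
The difference between the two modes can be sketched as a plain function that walks the steps and consults an approval callback. `runGatedSteps` and its types are hypothetical and only model the control flow; real approvals are durable and asynchronous.

```ts
type GatedStep = { id: string; risk: "safe" | "risky" | "critical" };
type DenyMode = "fail" | "skip";

// Hypothetical model: returns the ids of the steps that actually ran.
function runGatedSteps(
  steps: GatedStep[],
  approve: (stepId: string) => boolean,
  onDeny: DenyMode = "fail",
): string[] {
  const executed: string[] = [];
  for (const step of steps) {
    const needsApproval = step.risk !== "safe";
    if (needsApproval && !approve(step.id)) {
      if (onDeny === "fail") {
        throw new Error(`step "${step.id}" was denied`);
      }
      continue; // onDeny === "skip": drop this step and keep going
    }
    executed.push(step.id);
  }
  return executed;
}

const cleanupSteps: GatedStep[] = [
  { id: "optional-cleanup", risk: "risky" },
  { id: "verify", risk: "safe" },
];

// The operator denies everything; only the safe step runs.
const ranAfterDenial = runGatedSteps(cleanupSteps, () => false, "skip");
```

With `onDeny` left at `"fail"`, the same denial would abort before `verify` runs.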

## Per-step agents and outputs

Override the agent or output schema for individual steps:

```tsx
<Runbook
  defaultAgent={ops}
  stepOutput={outputs.stepResult}
  steps={[
    { id: "analyze", agent: analyst, output: outputs.analysis, command: "Analyze system metrics", risk: "safe" },
    { id: "remediate", command: "systemctl restart api", risk: "critical" },
  ]}
/>
```

## Node IDs

| Node | ID |
| --- | --- |
| Safe step | `{id}-{step.id}` |
| Risky/critical approval | `{id}-{step.id}-approval` |
| Risky/critical step | `{id}-{step.id}` |

## Notes

- Steps execute in declaration order. Each step depends on the previous via `needs`.
- Critical steps include `elevated: true` in the approval metadata, which can be used by custom approval UIs to require stronger authorization.
- The approval output for each gated step is stored at `{prefix}-{step.id}-approval-decision`.
- Combine with `<Sequence>` to place a Runbook alongside other workflow steps.
- If no `agent` is provided on a step and no `defaultAgent` is set, the Task renders as a static or compute node.

---

## <Subflow>

> Invoke a child workflow with its own retry, cache, and resume boundary.
> Source: https://smithers.sh/components/subflow

`<Subflow>` composes workflows from other workflows. It invokes a child workflow definition either as an independent run (`childRun` mode) or embedded inline in the parent (`inline` mode).

## Import

```tsx
import { Subflow } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | **(required)** | Unique node id within the workflow. |
| `workflow` | `Function` | **(required)** | The child workflow definition (a smithers workflow function). |
| `input` | `unknown` | `undefined` | Input to pass to the child workflow. |
| `mode` | `"childRun" \| "inline"` | `"childRun"` | `childRun` creates its own DB row/run; `inline` embeds in parent tree. |
| `output` | `z.ZodObject \| Table \| string` | **(required)** | Where to store the subflow result. |
| `skipIf` | `boolean` | `false` | Skip this node entirely. |
| `timeoutMs` | `number` | `undefined` | Max execution time in ms. |
| `retries` | `number` | `0` | Retry attempts before failure. |
| `retryPolicy` | `RetryPolicy` | `undefined` | `{ backoff?: "fixed" \| "linear" \| "exponential", initialDelayMs?: number }` |
| `continueOnFail` | `boolean` | `false` | Workflow continues even if this node fails. |
| `cache` | `CachePolicy` | `undefined` | `{ by?: (ctx) => unknown, version?: string }`. Skip re-execution on cache hit. |
| `dependsOn` | `string[]` | `undefined` | Task IDs that must complete first. |
| `needs` | `Record<string, string>` | `undefined` | Named deps. Keys become context keys, values are task IDs. |
| `label` | `string` | `id` | Display label override. |
| `meta` | `Record<string, unknown>` | `undefined` | Extra metadata. |

## childRun Mode (default)

The child workflow gets its own database row and run boundary. Retries, caching, and resume all apply to the child run as a unit.

```tsx
import { Workflow, Sequence, Task, Subflow, createSmithers } from "smithers-orchestrator";
import { z } from "zod";
import childWorkflow from "./child-workflow";

const { smithers, outputs } = createSmithers({
  childResult: z.object({ status: z.string() }),
  finalResult: z.object({ summary: z.string() }),
});

export default smithers((ctx) => (
  <Workflow name="parent-flow">
    <Sequence>
      <Subflow
        id="run-child"
        workflow={childWorkflow}
        input={{ repo: "acme/app" }}
        output={outputs.childResult}
        retries={2}
        timeoutMs={300_000}
      />

      <Task id="summarize" output={outputs.finalResult} agent={summarizer}>
        Summarize the child workflow result.
      </Task>
    </Sequence>
  </Workflow>
));
```

## inline Mode

The child workflow tree is rendered directly inside the parent. No separate DB row is created -- the child's tasks appear as siblings in the parent plan.

```tsx
<Subflow
  id="inline-checks"
  workflow={checksWorkflow}
  input={{ strict: true }}
  output={outputs.checksResult}
  mode="inline"
/>
```

## Conditional Skipping

```tsx
<Subflow
  id="optional-child"
  workflow={optionalWorkflow}
  output={outputs.optionalResult}
  skipIf={!ctx.input.runOptional}
/>
```

## Behavior

- In `childRun` mode, the subflow creates an independent run entry. The parent task waits for the child run to complete.
- In `inline` mode, the child workflow's component tree is rendered inline as a sequence in the parent plan tree.
- Standard retry, cache, and timeout semantics apply at the subflow boundary.
- The subflow result is written to the configured `output` target.

## Rendering

`<Subflow>` renders as a `smithers:subflow` host element. The scheduler treats `childRun` mode as a single task node and `inline` mode as a sequence of the child's tasks.

## Notes

- `childRun` is the default and recommended mode for isolation and resumability.
- Use `inline` when you want the child tasks to participate directly in the parent's plan and share its retry/resume scope.
- Subflows are composable -- a child workflow can itself contain `<Subflow>` nodes.

---

## <WaitForEvent>

> Durably suspend until a correlated external event or webhook arrives, or timeout.
> Source: https://smithers.sh/components/wait-for-event

Push-based complement to polling. A `<WaitForEvent>` durably suspends the workflow until a matching external event arrives (or the timeout expires). The event payload is written to the configured output.

## Import

```tsx
import { WaitForEvent } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | **(required)** | Unique node id within the workflow. |
| `event` | `string` | **(required)** | Event name/type to wait for (e.g. `"deploy.completed"`). |
| `correlationId` | `string` | `undefined` | Correlation key to match the right event instance. |
| `output` | `z.ZodObject \| Table \| string` | **(required)** | Where to store the event payload. |
| `outputSchema` | `z.ZodObject` | `undefined` | Zod schema for validating the event payload. |
| `timeoutMs` | `number` | `undefined` | Max wait time in ms before timeout behavior triggers. |
| `onTimeout` | `"fail" \| "skip" \| "continue"` | `"fail"` | What happens when the timeout expires. |
| `async` | `boolean` | `false` | When `true`, unrelated downstream flow can continue while the event is still pending. Explicit dependencies still wait for the payload. |
| `skipIf` | `boolean` | `false` | Skip this node entirely. |
| `dependsOn` | `string[]` | `undefined` | Task IDs that must complete first. |
| `needs` | `Record<string, string>` | `undefined` | Named deps. Keys become context keys, values are task IDs. |
| `label` | `string` | `wait:<event>` | Display label override. |
| `meta` | `Record<string, unknown>` | `undefined` | Extra metadata. |

## Basic Usage

```tsx
import {
  Workflow,
  Sequence,
  Task,
  WaitForEvent,
  createSmithers,
} from "smithers-orchestrator";
import { z } from "zod";

const deployPayload = z.object({
  environment: z.string(),
  sha: z.string(),
  status: z.enum(["success", "failure"]),
});

const { smithers, outputs } = createSmithers({
  deployEvent: deployPayload,
  summary: z.object({ message: z.string() }),
});

export default smithers((ctx) => (
  <Workflow name="deploy-watcher">
    <Sequence>
      <WaitForEvent
        id="wait-deploy"
        event="deploy.completed"
        correlationId={ctx.input.deployId}
        output={outputs.deployEvent}
        outputSchema={deployPayload}
        timeoutMs={600_000}
        onTimeout="fail"
      />

      <Task id="notify" output={outputs.summary} agent={notifier}>
        The deploy finished. Summarize the result.
      </Task>
    </Sequence>
  </Workflow>
));
```

## Timeout Behaviors

| `onTimeout` | Effect |
| --- | --- |
| `"fail"` | Node fails. Workflow stops (unless `continueOnFail` is set on a parent). |
| `"skip"` | Node is skipped. Downstream nodes that depend on it see `skipped` status. |
| `"continue"` | Node completes with a null/empty payload. Downstream nodes proceed. |

## Correlated Events

Use `correlationId` to match specific event instances when multiple events of the same type may arrive:

```tsx
<WaitForEvent
  id="wait-pr-merged"
  event="github.pull_request.merged"
  correlationId={`pr-${ctx.input.prNumber}`}
  output={outputs.mergeEvent}
  timeoutMs={86_400_000}
/>
```
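
Conceptually, delivery comes down to comparing an incoming event against the recorded subscription. The sketch below shows one way that comparison could work; `matchesSubscription` and both types are hypothetical, and treating a missing `correlationId` as "match any event with this name" is an assumption, not documented behavior.

```ts
type EventSubscription = { event: string; correlationId?: string };
type IncomingEvent = { name: string; correlationId?: string; payload: unknown };

// Assumption: a subscription without a correlationId accepts any event of that name.
function matchesSubscription(sub: EventSubscription, incoming: IncomingEvent): boolean {
  if (sub.event !== incoming.name) {
    return false;
  }
  return sub.correlationId === undefined || sub.correlationId === incoming.correlationId;
}

const mergeSubscription: EventSubscription = {
  event: "github.pull_request.merged",
  correlationId: "pr-42",
};

const otherPrMerged: IncomingEvent = {
  name: "github.pull_request.merged",
  correlationId: "pr-41",
  payload: {},
};
const thisPrMerged: IncomingEvent = { ...otherPrMerged, correlationId: "pr-42" };

const deliveredOther = matchesSubscription(mergeSubscription, otherPrMerged);
const deliveredThis = matchesSubscription(mergeSubscription, thisPrMerged);
```

Only the event for `pr-42` would resume the waiting node; the merge of `pr-41` is ignored by this subscription.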

## Durable Deferred Resolution

When `async` is `true`, the wait node becomes a **durable deferred**: the workflow records the subscription durably and allows unrelated downstream work to proceed immediately. The deferred resolves when the event arrives. Any task that explicitly depends on this node (via `dependsOn` or `needs`) still blocks until the payload is available.

This pattern is useful when you want to kick off a long-running external process and continue with independent work while waiting:

```tsx
<Workflow name="async-deploy">
  <Sequence>
    <Task id="trigger-build" output={outputs.trigger} agent={ciAgent}>
      Trigger a CI build and return the build ID.
    </Task>

    <WaitForEvent
      id="build-done"
      event="ci.build.completed"
      correlationId={ctx.outputMaybe(outputs.trigger, { nodeId: "trigger-build" })?.buildId}
      output={outputs.buildResult}
      timeoutMs={1_800_000}
      onTimeout="fail"
      async
    />

    {/* This runs immediately — it does not depend on the build result */}
    <Task id="notify-started" output={outputs.notifyStarted}>
      {{ message: "Build triggered, waiting for completion." }}
    </Task>

    {/* This blocks until the build event payload is available */}
    <Task id="deploy" output={outputs.deploy} dependsOn={["build-done"]} agent={deployAgent}>
      Deploy the completed build.
    </Task>
  </Sequence>
</Workflow>
```

The subscription survives worker restarts — the Temporal runtime checkpoints the event wait durably. When the matching event arrives, the deferred resolves and any blocked dependents resume.

## Behavior

- When the scheduler reaches this node, it enters `waiting-event` status.
- The engine durably records the event subscription (event name + correlation ID).
- When a matching event arrives (via webhook, API call, or event bus), the node resumes and writes the payload to `output`.
- If `timeoutMs` elapses first, the `onTimeout` behavior applies.
- With `async`, later unrelated nodes can continue before the event arrives. Use explicit `dependsOn` / `needs` when later work really requires the payload.

## Rendering

`<WaitForEvent>` renders as a `smithers:wait-for-event` host element. The scheduler treats it like a task node that enters `waiting-event` status instead of `in-progress`.

## Notes

- This is a push-based primitive. For poll-based external checks, use [`<Poller>`](/components/poller) or a `<Task>` with a compute function.
- The event payload is validated against `outputSchema` when provided.
- Combine with `<Sequence>` to gate downstream work on external events.
- Multiple `<WaitForEvent>` nodes can wait for different events in a `<Parallel>` block.
- Async event waits contribute to the Prometheus gauge `smithers_external_wait_async_pending{kind="event"}` while unresolved.

---

## <Signal>

> A typed wrapper around <WaitForEvent> for external signals keyed by node id.
> Source: https://smithers.sh/components/signal

`<Signal>` is the ergonomic form of [`<WaitForEvent>`](/components/wait-for-event) when the signal name should match the node id and the payload should be typed by a Zod schema.

## Import

```tsx
import { Signal } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | **(required)** | Signal name and node id. |
| `schema` | `z.ZodObject` | **(required)** | Typed payload schema and output target. |
| `correlationId` | `string` | `undefined` | Correlation key for matching a specific signal instance. |
| `timeoutMs` | `number` | `undefined` | Max wait time in ms. |
| `onTimeout` | `"fail" \| "skip" \| "continue"` | `"fail"` | Timeout behavior. |
| `async` | `boolean` | `false` | When `true`, unrelated downstream flow can continue while the signal is still pending. Explicit dependencies still wait for the payload. |
| `skipIf` | `boolean` | `false` | Skip this node entirely. |
| `dependsOn` | `string[]` | `undefined` | Task IDs that must complete first. |
| `needs` | `Record<string, string>` | `undefined` | Named deps. Keys become context keys, values are task IDs. |
| `label` | `string` | `signal:<id>` | Display label override. |
| `meta` | `Record<string, unknown>` | `undefined` | Extra metadata. |
| `children` | `(data) => ReactNode` | `undefined` | Optional typed render callback that mounts only after the signal payload exists. |

## Example

```tsx
import { Signal, Task, Workflow, createSmithers } from "smithers-orchestrator";
import { z } from "zod";

const { smithers, outputs } = createSmithers({
  feedback: z.object({
    rating: z.number(),
    comment: z.string(),
  }),
  summary: z.object({
    upper: z.string(),
  }),
});

export default smithers(() => (
  <Workflow name="signal-demo">
    <Signal id="user-feedback" schema={outputs.feedback} async>
      {(feedback) => (
        <Task id="summarize" output={outputs.summary}>
          {{ upper: feedback.comment.toUpperCase() }}
        </Task>
      )}
    </Signal>
  </Workflow>
));
```

## Behavior

- `<Signal>` renders a [`<WaitForEvent>`](/components/wait-for-event) internally with `event={id}` and `output={schema}`.
- Without `children`, it behaves like a plain typed [`<WaitForEvent>`](/components/wait-for-event).
- With `children`, the callback runs only after the payload has been received and validated.
- Async signal waits contribute to `smithers_external_wait_async_pending{kind="event"}` while unresolved.
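
The prop mapping described above can be sketched as a plain function. `toWaitForEventProps` and `SignalProps` are hypothetical names, and the sketch covers only the props named in this section; the real component may forward more.

```ts
type SignalProps<S> = {
  id: string;
  schema: S;
  correlationId?: string;
  timeoutMs?: number;
  onTimeout?: "fail" | "skip" | "continue";
  async?: boolean;
  label?: string;
};

// Hypothetical mapping from <Signal> props to <WaitForEvent> props.
function toWaitForEventProps<S>(props: SignalProps<S>) {
  const { id, schema, label, ...rest } = props;
  return {
    ...rest,
    id,
    event: id, // the signal name is the node id
    output: schema, // the schema doubles as the output target
    label: label ?? `signal:${id}`,
  };
}

// Stand-in for a registered output such as outputs.feedback.
const feedbackTarget = { kind: "feedback-schema" };
const feedbackWaitProps = toWaitForEventProps({
  id: "user-feedback",
  schema: feedbackTarget,
  async: true,
});
```

Here `feedbackWaitProps.event` and `feedbackWaitProps.id` are both `"user-feedback"`, which is the shorthand `<Signal>` provides.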

---

## <Timer>

> Durably suspend a workflow for a relative duration or until an absolute point in time.
> Source: https://smithers.sh/components/timer

Think of `<Timer>` as a durable `sleep`. When the scheduler reaches this node, the [workflow suspends](/concepts/suspend-and-resume) — no polling, no busy-waiting. The Temporal runtime checkpoints the delay and resumes execution once the wall-clock condition is satisfied, even if the worker restarts in the meantime.

## Import

```tsx
import { Timer } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | **(required)** | Unique node id within the workflow. |
| `duration` | `string` | `undefined` | Relative delay as a human-readable string (e.g. `"500ms"`, `"30s"`, `"2h"`, `"7d"`). Exactly one of `duration` or `until` is required. |
| `until` | `string \| Date` | `undefined` | Absolute fire time as an ISO 8601 timestamp string or a `Date` object. Exactly one of `duration` or `until` is required. |
| `skipIf` | `boolean` | `false` | Skip this node entirely. The node resolves immediately with no delay. |
| `dependsOn` | `string[]` | `undefined` | Task IDs that must complete before the timer starts. |
| `needs` | `Record<string, string>` | `undefined` | Named deps. Keys become context keys, values are task IDs. |
| `label` | `string` | `timer:<id>` | Display label override. |
| `meta` | `Record<string, unknown>` | `undefined` | Extra metadata attached to the node record. |

> **Warning:** Exactly one of `duration` or `until` must be provided. Providing both, or neither, throws at render time.


> **Warning:** The `every` prop (recurring timer) is reserved for a future phase and is not supported. Passing it throws at render time.


## Duration strings

The `duration` prop accepts a concise human-readable format.

| String | Meaning |
| --- | --- |
| `"500ms"` | 500 milliseconds |
| `"1s"` / `"30s"` | 1 second / 30 seconds |
| `"5m"` / `"30m"` | 5 minutes / 30 minutes |
| `"1h"` / `"2h"` | 1 hour / 2 hours |
| `"1d"` / `"7d"` | 1 day / 7 days |
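
For intuition, the strings in the table map to milliseconds as shown below. This parser is a minimal sketch that covers only the formats listed here; `parseDurationMs` is a hypothetical name, and the parser Smithers actually uses may accept other forms.

```ts
const UNIT_MS: Record<string, number> = {
  ms: 1,
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

// Hypothetical parser for strings such as "500ms", "30s", "5m", "2h", "7d".
function parseDurationMs(input: string): number {
  const match = /^(\d+)(ms|s|m|h|d)$/.exec(input.trim());
  if (!match) {
    throw new Error(`unsupported duration: "${input}"`);
  }
  return Number(match[1]) * UNIT_MS[match[2]];
}

const cooldownMs = parseDurationMs("30s");
const weekMs = parseDurationMs("7d");
```

Strings outside this format, such as `"soon"`, are rejected in the sketch rather than silently treated as zero.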

## Relative delay

Pause for 30 seconds before proceeding to the next step:

```tsx
import { Sequence, Task, Timer, Workflow, createSmithers } from "smithers-orchestrator";
import { z } from "zod";

const { smithers, outputs } = createSmithers({
  report: z.object({ summary: z.string() }),
});

export default smithers(() => (
  <Workflow name="delayed-report">
    <Sequence>
      <Timer id="cooldown" duration="30s" />
      <Task id="report" output={outputs.report} agent={reportAgent}>
        Generate the daily summary report.
      </Task>
    </Sequence>
  </Workflow>
));
```

## Absolute timestamp

Use `until` when the target time is computed at runtime — for example, a deadline stored in the workflow input:

```tsx
export default smithers((ctx) => (
  <Workflow name="scheduled-reminder">
    <Sequence>
      <Timer id="wait-until-deadline" until={ctx.input.reminderAt} />
      <Task id="send-reminder" output={outputs.notification} agent={notifierAgent}>
        Send the reminder to the user.
      </Task>
    </Sequence>
  </Workflow>
));
```

`ctx.input.reminderAt` can be an ISO 8601 string (`"2026-06-01T09:00:00Z"`) or a `Date` object. If the timestamp is already in the past when the node is reached, the timer fires immediately.

## Timer inside a [`<Loop>`](/components/loop)

Derive a duration from context at render time to implement a simple backoff:

```tsx
export default smithers((ctx) => {
  const delay = ctx.iteration === 0 ? "5m" : "30m";

  return (
    <Workflow name="retry-with-backoff">
      <Loop
        until={ctx.outputMaybe(outputs.result)?.success === true}
        maxIterations={5}
      >
        <Sequence>
          <Timer id="backoff" duration={delay} />
          <Task id="attempt" output={outputs.result} agent={workerAgent}>
            Attempt the operation.
          </Task>
        </Sequence>
      </Loop>
    </Workflow>
  );
});
```

## Behavior

- When the scheduler reaches a `<Timer>` node, the node enters `waiting-timer` status.
- The engine records the timer target — a resolved UTC timestamp — durably in the workflow history.
- The worker thread releases the execution slot. No resources are held during the wait.
- When the target time arrives, Temporal wakes the workflow and the node transitions to `completed`. Downstream nodes are then eligible to run.
- If `skipIf` is `true`, the node resolves immediately without any delay.
- Worker restarts or redeployments during the wait do not reset the timer — the checkpoint is stored in the Temporal event history.
- Timers in separate [parallel](/components/parallel) branches wait independently.
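The target-resolution rules above can be sketched as a pure function. This is an illustration under stated assumptions, not the engine's code: `TimerSpecSketch`, `resolveTimerTarget`, and `remainingMs` are hypothetical names, and `durationMs` stands in for an already-parsed `duration` string.

```ts
// Hypothetical shape: `durationMs` stands in for a parsed `duration` prop.
interface TimerSpecSketch {
  durationMs?: number;
  until?: string | Date;
}

// Resolve the absolute fire time that would be recorded for the timer.
function resolveTimerTarget(spec: TimerSpecSketch, now: Date): Date {
  const { durationMs, until } = spec;
  // Exactly one of the two must be set; both or neither is an error.
  if ((durationMs === undefined) === (until === undefined)) {
    throw new Error("Exactly one of duration or until must be provided");
  }
  if (durationMs !== undefined) {
    return new Date(now.getTime() + durationMs);
  }
  const target =
    typeof until === "string" ? new Date(until) : new Date((until as Date).getTime());
  if (isNaN(target.getTime())) {
    throw new Error("until is not a valid timestamp");
  }
  return target;
}

// A target in the past leaves nothing to wait for, so the timer fires immediately.
function remainingMs(target: Date, now: Date): number {
  return Math.max(0, target.getTime() - now.getTime());
}
```

Because the stored value is an absolute timestamp rather than a countdown, recomputing `remainingMs` after a restart gives the correct remaining wait.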

## Rendering

`<Timer>` renders as a `smithers:timer` host element. The scheduler treats it as a leaf node that blocks the [sequence](/components/sequence) until the timer fires.

## Notes

- `<Timer>` produces no output. It has no `output` prop. It is a pure synchronization point.
- Use `dependsOn` or `needs` when the timer should start only after specific upstream tasks, rather than relying on sequence position alone.
- For event-driven delays (wait for an external signal rather than a fixed time), use [`<WaitForEvent>`](/components/wait-for-event) instead.
- Timers inside a `<Loop>` body reset each iteration because each iteration is a fresh render of the tree.

---

## <HumanTask>

> A task where the human is the agent -- enters JSON matching the output schema with validation retries.
> Source: https://smithers.sh/components/human-task

Like a [`<Task>`](/components/task), but the human is the agent. The [workflow suspends](/concepts/suspend-and-resume) until a human provides JSON input matching the output schema. If the input fails validation, the human can try again, up to `maxAttempts` attempts in total (default 10).

## Import

```tsx
import { HumanTask } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | **(required)** | Unique node id within the workflow. |
| `output` | `z.ZodObject \| Table \| string` | **(required)** | Where to store the human's response. |
| `outputSchema` | `z.ZodObject` | `undefined` | Zod schema the human must conform to. Inferred from `output` when it is a Zod schema. |
| `prompt` | `string \| ReactNode` | **(required)** | Instructions shown to the human. |
| `maxAttempts` | `number` | `10` | Max total attempts (first attempt plus validation retries) before failure. |
| `async` | `boolean` | `false` | When `true`, unrelated downstream flow can continue while the human response is still pending. Explicit dependencies still wait for the validated output. |
| `skipIf` | `boolean` | `false` | Skip this node entirely. |
| `timeoutMs` | `number` | `undefined` | Max wait time in ms. |
| `continueOnFail` | `boolean` | `false` | Workflow continues even if this node fails. |
| `dependsOn` | `string[]` | `undefined` | Task IDs that must complete first. |
| `needs` | `Record<string, string>` | `undefined` | Named deps. Keys become context keys, values are task IDs. |
| `label` | `string` | `human:<id>` | Display label override. |
| `meta` | `Record<string, unknown>` | `undefined` | Extra metadata. |

## Schema-driven Example

```tsx
import {
  Workflow,
  Sequence,
  Task,
  HumanTask,
  createSmithers,
} from "smithers-orchestrator";
import { z } from "zod";

const reviewSchema = z.object({
  approved: z.boolean(),
  comments: z.string(),
  severity: z.enum(["low", "medium", "high"]),
});

const { smithers, outputs } = createSmithers({
  review: reviewSchema,
  summary: z.object({ status: z.string() }),
});

export default smithers((ctx) => {
  const review = ctx.outputMaybe(outputs.review, { nodeId: "human-review" });

  return (
    <Workflow name="review-flow">
      <Sequence>
        <HumanTask
          id="human-review"
          output={outputs.review}
          prompt="Please review the PR and provide your assessment. Fill in approved (boolean), comments (string), and severity (low/medium/high)."
          maxAttempts={5}
          timeoutMs={86_400_000}
        />

        {review ? (
          <Task id="record" output={outputs.summary}>
            {{ status: review.approved ? "approved" : "changes-requested" }}
          </Task>
        ) : null}
      </Sequence>
    </Workflow>
  );
});
```

## How It Works

1. The workflow reaches the `<HumanTask>` node and enters [`waiting-approval`](/concepts/approvals) status.
2. The human submits JSON input via `smithers approve` (the input goes in the `note` field).
3. The compute function parses and validates the JSON against the `outputSchema`.
4. If validation fails, the task retries -- the human is prompted again (up to `maxAttempts`).
5. On success, the validated data is written to the configured `output`.

## Submitting Input

Use the CLI to submit human input:

```bash
smithers approve <run-id> <node-id> --note '{"approved": true, "comments": "LGTM", "severity": "low"}'
```

## Validation Retries

When the human submits input that is not valid JSON or does not match the schema (wrong shape, missing fields, wrong types), the task fails validation and retries. The retry policy uses a `fixed` backoff of 0 ms, so the human can re-attempt immediately.

```tsx
<HumanTask
  id="data-entry"
  output={outputs.formData}
  prompt="Enter the customer record as JSON with fields: name (string), email (string), tier (free|pro|enterprise)."
  maxAttempts={10}
/>
```

If the human cannot provide valid input within `maxAttempts`, the task fails.
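The attempt accounting can be illustrated with a small simulation. This is a sketch, not the Smithers compute function: a hand-written check stands in for the Zod schema, and `validateSubmission` and `runAttempts` are hypothetical names.

```ts
const TIERS = ["free", "pro", "enterprise"] as const;
type Tier = (typeof TIERS)[number];

interface CustomerRecord {
  name: string;
  email: string;
  tier: Tier;
}

type Validation = { ok: true; value: CustomerRecord } | { ok: false; error: string };

function isTier(value: unknown): value is Tier {
  return typeof value === "string" && (TIERS as readonly string[]).indexOf(value) !== -1;
}

// Parse the raw note and check its shape; a stand-in for schema validation.
function validateSubmission(raw: string): Validation {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, error: "not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, error: "expected a JSON object" };
  }
  const { name, email, tier } = parsed as Record<string, unknown>;
  if (typeof name !== "string") return { ok: false, error: "name must be a string" };
  if (typeof email !== "string") return { ok: false, error: "email must be a string" };
  if (!isTier(tier)) return { ok: false, error: "tier must be free, pro, or enterprise" };
  return { ok: true, value: { name, email, tier } };
}

// Feed submissions in order; stop at the first valid one or after maxAttempts.
function runAttempts(submissions: string[], maxAttempts: number): { attempts: number; result: Validation } {
  let result: Validation = { ok: false, error: "no input submitted" };
  let attempts = 0;
  for (const raw of submissions.slice(0, maxAttempts)) {
    attempts += 1;
    result = validateSubmission(raw);
    if (result.ok) break;
  }
  return { attempts, result };
}
```

With `maxAttempts` set to 2, a third submission is never read, even if it would have been valid.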

## Prompt fallback

When the task meta is read back from the database (for example by a UI or the [CLI](/cli/overview)), the display prompt is resolved with a fallback chain:

1. The `prompt` prop value rendered to plain text at component creation time is stored in `meta.prompt`.
2. At display time, `getHumanTaskPrompt(meta, fallback)` returns `meta.prompt` if it is a non-empty string, otherwise it returns the provided `fallback` string.
3. If `prompt` is a React element (e.g. an MDX component), it is rendered to markdown before storage via `renderPromptToText`. The result is what humans see; no JSX or HTML tags reach the UI.

This means a `<HumanTask>` always has a stable text representation of its prompt regardless of whether the original JSX tree is still in scope.
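The display-time lookup described in step 2 is simple enough to sketch. The function below mirrors the stated behavior of `getHumanTaskPrompt(meta, fallback)` but is an illustrative re-implementation under a different name, with an assumed `meta` shape.

```ts
// Assumed shape: only the `prompt` field of the stored meta matters here.
type HumanTaskMetaSketch = { prompt?: unknown } | null | undefined;

// Return the stored prompt when it is a non-empty string, otherwise the fallback.
function getHumanTaskPromptSketch(meta: HumanTaskMetaSketch, fallback: string): string {
  const prompt = meta?.prompt;
  return typeof prompt === "string" && prompt.length > 0 ? prompt : fallback;
}
```

A caller such as a UI might pass a generic fallback like `"Provide the requested input."` for records that were stored without a prompt.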

## Request ID generation

Each `<HumanTask>` creates a human request record identified by a deterministic ID:

```
human:<runId>:<nodeId>:<iteration>
```

The ID is built by `buildHumanRequestId(runId, nodeId, iteration)` and is stable across retries within the same iteration. It is also used to link the human request record to the corresponding approval record: when a human submits input via `smithers approve`, the compute function looks up both records by this ID, prefers `humanRequest.responseJson` if present, and falls back to `approval.note` for backwards compatibility with approval-only submissions.
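As an illustration, the ID format and the lookup preference can be sketched as follows. The record shapes and function names here are assumptions made for the example; only the ID format and the preference order come from the description above.

```ts
// Sketch of the documented ID format: human:<runId>:<nodeId>:<iteration>
function buildHumanRequestIdSketch(runId: string, nodeId: string, iteration: number): string {
  return `human:${runId}:${nodeId}:${iteration}`;
}

// Assumed minimal record shapes; the real records carry more fields.
interface HumanRequestRecordSketch {
  responseJson?: string | null;
}
interface ApprovalRecordSketch {
  note?: string | null;
}

// Prefer the human request's responseJson, then fall back to the approval note.
function pickRawHumanInput(
  humanRequest: HumanRequestRecordSketch | undefined,
  approval: ApprovalRecordSketch | undefined,
): string | undefined {
  return humanRequest?.responseJson ?? approval?.note ?? undefined;
}
```

Because the ID depends only on the run, node, and iteration, a retry within the same iteration looks up the same records.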

## Durable deferred resolution

`<HumanTask>` uses the same durable deferred mechanism as [`<Approval>`](/components/approval). When the node enters `waiting-approval` state, a `DurableDeferred` from `@effect/workflow` is created and awaited by the executing task fiber. The deferred is keyed to the run, node, and iteration, so it survives process restarts: if the worker crashes while waiting, the next worker that picks up the task re-awaits the same deferred and receives the resolution as soon as a human submits input.

When `smithers approve` is called, `bridgeApprovalResolve` resolves the deferred, which unblocks the awaiting fiber and lets the compute function proceed to read and validate the human input. No polling is needed.

## Behavior

- Internally creates a `smithers:task` host element with `needsApproval: true` and a compute function that reads human input from the database.
- Same approval flow as [`<Approval>`](/components/approval) -- the node suspends and waits for human input.
- With `async`, later unrelated nodes in the same sequence may continue rendering and executing before the human submits input.
- Schema validation happens at compute time, not at submission time.
- The `retries` prop is set to `maxAttempts - 1` (first attempt + retries = total attempts).

## `<HumanTask>` vs `<Approval>` vs `needsApproval`

| Use | When |
| --- | --- |
| `<HumanTask>` | Human provides structured data matching a schema. Validation + retries. |
| [`<Approval>`](/components/approval) | Human approves or denies. Decision persisted as `ApprovalDecision`. |
| `needsApproval` on [`<Task>`](/components/task) | Simple pause before an agent task. No separate value needed. |

## Notes

- The human's JSON input is stored in the approval `note` field as a JSON string.
- `outputSchema` is inferred from `output` when `output` is a Zod schema.
- Combine with [`<Sequence>`](/components/sequence) to gate downstream work on human input.
- Use `ctx.outputMaybe(...)` when rendering branches that consume an async human task's result.
- The `meta` field includes `humanTask: true`, `maxAttempts`, and the `prompt` for UI rendering.

---

## <Sandbox>

> Run a child workflow inside an isolated execution environment — Docker, Codeplane, or Bubblewrap — and collect its output bundle back into the parent workflow.
> Source: https://smithers.sh/components/sandbox

```tsx
import { Sandbox } from "smithers-orchestrator";
```

`<Sandbox>` spawns a child workflow inside an isolated runtime, ships a request bundle to it, waits for execution to finish, and collects the result bundle back into the parent workflow. Diffs produced inside the sandbox can be reviewed and optionally auto-accepted before they are applied to the host environment. Use `<Sandbox>` when a task needs a clean filesystem, network isolation, or a reproducible dependency environment that must not share state with the caller.

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | required | Unique sandbox identifier within the workflow run. |
| `output` | `ZodObject \| DrizzleTable \| string` | required | Output target for the collected bundle result. |
| `workflow` | `(...args: any[]) => any` | `undefined` | Child workflow definition to execute inside the sandbox. |
| `input` | `unknown` | `undefined` | Input value passed to the child workflow. |
| `runtime` | `"bubblewrap" \| "docker" \| "codeplane"` | `"bubblewrap"` | Execution runtime. When set to `"docker"`, falls back to `"bubblewrap"` if the Docker daemon is not reachable. |
| `allowNetwork` | `boolean` | `false` | Whether the sandbox has outbound network access. |
| `reviewDiffs` | `boolean` | `true` | Trigger the diff review event when the bundle contains patch files. |
| `autoAcceptDiffs` | `boolean` | `false` | Automatically accept diffs without requiring human approval. |
| `image` | `string` | `undefined` | Docker image to use for the `docker` runtime. |
| `env` | `Record<string, string>` | `undefined` | Environment variables injected into the container. |
| `ports` | `Array<{ host: number; container: number }>` | `undefined` | Port mappings for Docker containers. |
| `volumes` | `SandboxVolumeMount[]` | `undefined` | Volume mounts for Docker containers. |
| `memoryLimit` | `string` | `undefined` | Memory limit for the container (e.g. `"512m"`, `"2g"`). |
| `cpuLimit` | `string` | `undefined` | CPU limit for the container (e.g. `"0.5"`, `"2"`). |
| `command` | `string` | `undefined` | Override the default entrypoint command inside the sandbox. |
| `workspace` | `SandboxWorkspaceSpec` | `undefined` | Codeplane workspace configuration. |
| `skipIf` | `boolean` | `false` | Skip the sandbox entirely. Returns `null`. |
| `timeoutMs` | `number` | `undefined` | Total sandbox execution timeout in milliseconds. |
| `heartbeatTimeoutMs` | `number` | `undefined` | Heartbeat timeout in milliseconds. |
| `retries` | `number` | `undefined` | Number of retry attempts on failure. |
| `retryPolicy` | `RetryPolicy` | `undefined` | Retry policy configuration. |
| `continueOnFail` | `boolean` | `false` | Continue workflow execution even if the sandbox fails. |
| `cache` | `CachePolicy` | `undefined` | Cache policy for the sandbox result. |
| `dependsOn` | `string[]` | `undefined` | Explicit dependency IDs that must complete before this sandbox starts. |
| `needs` | `Record<string, string>` | `undefined` | Named output bindings from other steps. |
| `label` | `string` | `id` | Display label shown in the workflow UI. |
| `meta` | `Record<string, unknown>` | `undefined` | Arbitrary metadata attached to the sandbox event. |
| `children` | `ReactNode` | `undefined` | Child workflow body when using a `createSmithers()`-bound `Sandbox` wrapper. |

### SandboxVolumeMount

| Field | Type | Description |
| --- | --- | --- |
| `host` | `string` | Absolute path on the host machine. |
| `container` | `string` | Path inside the container. |
| `readonly` | `boolean` | Mount as read-only if `true`. |

### SandboxWorkspaceSpec

| Field | Type | Description |
| --- | --- | --- |
| `name` | `string` | Workspace name in the Codeplane account. |
| `snapshotId` | `string` | Snapshot ID to restore before execution. |
| `idleTimeoutSecs` | `number` | Seconds of inactivity before the workspace stops. |
| `persistence` | `"ephemeral" \| "sticky"` | Whether the workspace is discarded after each run (`"ephemeral"`) or kept between runs (`"sticky"`). |

## Basic usage with Docker

Run a code-generation workflow inside a Docker container with a specific image and resource limits:

```tsx
import { Sandbox, Workflow, createSmithers } from "smithers-orchestrator";
import { z } from "zod";
import { generateCodeWorkflow } from "./workflows/generate-code";

const { smithers, outputs } = createSmithers({
  result: z.object({ files: z.array(z.string()), summary: z.string() }),
});

// `ctx` is supplied by the smithers() builder; `ctx.input` carries the run input.
export default smithers((ctx) => (
  <Workflow name="code-gen-sandbox">
    <Sandbox
      id="generate"
      workflow={generateCodeWorkflow}
      input={{ prompt: ctx.input.prompt, language: "typescript" }}
      output={outputs.result}
      runtime="docker"
      image="node:20-alpine"
      env={{ NODE_ENV: "production", LOG_LEVEL: "info" }}
      ports={[{ host: 3000, container: 3000 }]}
      memoryLimit="1g"
      cpuLimit="1"
      allowNetwork={false}
      reviewDiffs={true}
      autoAcceptDiffs={false}
      timeoutMs={300_000}
    />
  </Workflow>
));
```

## Codeplane persistent workspace

Use a Codeplane workspace with a pre-built snapshot for faster startup and sticky persistence across runs:

```tsx
import { Sandbox, Workflow, createSmithers } from "smithers-orchestrator";
import { z } from "zod";
import { testRunnerWorkflow } from "./workflows/test-runner";

const { smithers, outputs } = createSmithers({
  testResult: z.object({
    passed: z.number(),
    failed: z.number(),
    coverage: z.number(),
  }),
});

// `ctx` is supplied by the smithers() builder; `ctx.input` carries the run input.
export default smithers((ctx) => (
  <Workflow name="test-in-codeplane">
    <Sandbox
      id="run-tests"
      workflow={testRunnerWorkflow}
      input={{ branch: ctx.input.branch, suite: "integration" }}
      output={outputs.testResult}
      runtime="codeplane"
      workspace={{
        name: "test-runner",
        snapshotId: "snap_abc123",
        persistence: "sticky",
        idleTimeoutSecs: 300,
      }}
      allowNetwork={true}
      reviewDiffs={false}
      timeoutMs={600_000}
    />
  </Workflow>
));
```

## With diff review and conditional auto-accept

Run a refactoring workflow that produces patches. Auto-accept only when the parent input explicitly approves:

```tsx
import { Sandbox, Workflow, createSmithers } from "smithers-orchestrator";
import { z } from "zod";
import { refactorWorkflow } from "./workflows/refactor";

const { smithers, outputs } = createSmithers({
  refactor: z.object({ summary: z.string(), patchCount: z.number() }),
});

// `ctx` is supplied by the smithers() builder; `ctx.input` carries the run input.
export default smithers((ctx) => (
  <Workflow name="refactor-sandbox">
    <Sandbox
      id="refactor"
      workflow={refactorWorkflow}
      input={{ target: ctx.input.filepath, style: ctx.input.styleGuide }}
      output={outputs.refactor}
      runtime="docker"
      image="node:20-alpine"
      allowNetwork={false}
      reviewDiffs={true}
      autoAcceptDiffs={ctx.input.autoApprove === true}
      timeoutMs={180_000}
      retries={1}
      continueOnFail={false}
    />
  </Workflow>
));
```

## Runtime comparison

| Feature | `bubblewrap` | `docker` | `codeplane` |
| --- | --- | --- | --- |
| Requires external daemon | No | Yes (Docker) | Yes (API credentials) |
| Custom image | No | Yes (`image`) | Workspace snapshot |
| Port mapping | No | Yes (`ports`) | No |
| Volume mounts | No | Yes (`volumes`) | No |
| Resource limits | No | Yes (`memoryLimit`, `cpuLimit`) | No |
| Environment variables | No | Yes (`env`) | No |
| Persistent workspace | No | No | Yes (`persistence: "sticky"`) |
| Snapshot restore | No | No | Yes (`snapshotId`) |
| Idle timeout | No | No | Yes (`idleTimeoutSecs`) |
| Auto-fallback target | — | `bubblewrap` | — |
| External credentials required | No | No | `CODEPLANE_API_URL`, `CODEPLANE_API_KEY` |

## How sandbox execution works

When the engine mounts a `<Sandbox>` node it follows this sequence:

1. Checks the active sandbox count against the concurrency limit. Fails immediately if the limit is reached.
2. Creates a `request-bundle` directory under `.smithers/sandboxes/<runId>/<sandboxId>/` and writes an initial `README.md` manifest with `status: "pending"`.
3. Calls the transport layer's `create` to provision the runtime environment (container, workspace, or local process).
4. Ships the request bundle to the sandbox via `ship`.
5. Executes `smithers up bundle.tsx` inside the sandbox.
6. Runs the child workflow as a detached child run.
7. Writes the child run's output and logs into a result bundle.
8. Calls `collect` on the transport to retrieve the result bundle path.
9. Validates the bundle: size, manifest structure, and patch path safety.
10. If the bundle contains patches and `reviewDiffs` is `true`, emits `SandboxDiffReviewRequested`. If `autoAcceptDiffs` is `false`, throws and leaves patches unapplied.
11. If `autoAcceptDiffs` is `true`, emits `SandboxDiffAccepted` and returns `manifest.outputs` to the parent workflow.
12. Always calls `cleanup` on the transport handle in a `finally` block, even on failure.
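The transport portion of this sequence (create, ship, execute, collect, and the unconditional cleanup) can be sketched with an in-memory transport. This is a simplified, synchronous illustration: the interface below is a hypothetical stand-in for the real transport service, and the paths are placeholders.

```ts
interface SandboxHandleSketch {
  id: string;
}

// Hypothetical, synchronous stand-in for the transport operations.
interface SandboxTransportSketch {
  create(config: { runtime: string }): SandboxHandleSketch;
  ship(bundlePath: string, handle: SandboxHandleSketch): void;
  execute(command: string, handle: SandboxHandleSketch): void;
  collect(handle: SandboxHandleSketch): string;
  cleanup(handle: SandboxHandleSketch): void;
}

function runSandboxSketch(
  transport: SandboxTransportSketch,
  bundlePath: string,
  command = "smithers up bundle.tsx",
): string {
  const handle = transport.create({ runtime: "bubblewrap" });
  try {
    transport.ship(bundlePath, handle);
    transport.execute(command, handle);
    return transport.collect(handle);
  } finally {
    // Cleanup always runs; its own errors are swallowed so they cannot mask the original failure.
    try {
      transport.cleanup(handle);
    } catch {
      /* ignored */
    }
  }
}

// Test double that records call order and can fail at a chosen step.
function recordingTransport(calls: string[], failOn?: string): SandboxTransportSketch {
  const step = (name: string): void => {
    calls.push(name);
    if (name === failOn) throw new Error(`${name} failed`);
  };
  return {
    create: () => {
      step("create");
      return { id: "sandbox-1" };
    },
    ship: () => step("ship"),
    execute: () => step("execute"),
    collect: () => {
      step("collect");
      return "/tmp/result";
    },
    cleanup: () => step("cleanup"),
  };
}
```

If `execute` throws, the recorded order is create, ship, execute, cleanup: `collect` is skipped but `cleanup` still runs.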

## Delta transport

The sandbox communicates with the host through a file-based delta transport. The host writes a request bundle — a directory containing a `README.md` JSON manifest — and the sandbox writes a result bundle back to a separate `result/` directory. The transport layer (`SandboxTransport`) abstracts the mechanics of moving those directories into and out of the runtime. Each transport operation is timed and reported to the `sandboxTransportDurationMs` metric.

The `SandboxTransportService` interface exposes five operations:

| Method | Description |
| --- | --- |
| `create(config)` | Provision the runtime and return a `SandboxHandle`. |
| `ship(bundlePath, handle)` | Copy the request bundle into the runtime. |
| `execute(command, handle)` | Run a command inside the runtime. |
| `collect(handle)` | Retrieve the result bundle from the runtime. |
| `cleanup(handle)` | Destroy or release the runtime environment. |

## Bundle structure and validation

Every result bundle must pass validation before the parent workflow receives its outputs.

```
<sandboxId>/
  README.md           — JSON manifest (required)
  patches/            — Unified diff files (.patch)
  artifacts/          — Arbitrary output files
  logs/
    stream.ndjson     — Streaming log capture (optional)
```

The `README.md` manifest is a JSON object with this shape:

```json
{
  "status": "finished",
  "runId": "run_abc123",
  "outputs": { "summary": "Done" },
  "patches": ["patches/change.patch"]
}
```

`status` must be one of `"finished"`, `"failed"`, or `"cancelled"`. Any other value causes validation to throw before the bundle is used.

### Bundle limits

| Limit | Value |
| --- | --- |
| Total bundle size | 100 MB |
| `README.md` size | 5 MB |
| Maximum patch files | 1,000 |
| Bundle path length | 1,024 characters |
| Run ID length | 256 characters |
| Output JSON depth | 16 levels |
| Output array length | 512 items |
| Output string length | 64 KB per string |
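A few of these checks can be sketched as a manifest parser. This is an illustration rather than the actual validator: `parseManifestSketch` is a hypothetical name, it covers only the status, run ID length, and patch count rules, and it assumes a missing `patches` field means no patches.

```ts
const VALID_STATUSES = ["finished", "failed", "cancelled"] as const;
type BundleStatus = (typeof VALID_STATUSES)[number];

interface BundleManifestSketch {
  status: BundleStatus;
  runId: string;
  outputs: unknown;
  patches: string[];
}

const MAX_PATCH_FILES = 1_000;
const MAX_RUN_ID_LENGTH = 256;

function parseManifestSketch(readmeText: string): BundleManifestSketch {
  const raw: unknown = JSON.parse(readmeText);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("manifest must be a JSON object");
  }
  const { status, runId, outputs, patches } = raw as Record<string, unknown>;
  if (typeof status !== "string" || (VALID_STATUSES as readonly string[]).indexOf(status) === -1) {
    throw new Error(`invalid status: ${String(status)}`);
  }
  if (typeof runId !== "string" || runId.length > MAX_RUN_ID_LENGTH) {
    throw new Error("runId must be a string of at most 256 characters");
  }
  // Assumption: an absent `patches` field is treated as an empty list.
  const patchList: unknown = patches === undefined ? [] : patches;
  if (!Array.isArray(patchList) || !patchList.every((p) => typeof p === "string")) {
    throw new Error("patches must be an array of strings");
  }
  if (patchList.length > MAX_PATCH_FILES) {
    throw new Error("too many patch files");
  }
  return { status: status as BundleStatus, runId, outputs, patches: patchList as string[] };
}
```

A request-bundle manifest with `status: "pending"` would be rejected by this check, which matches the rule that only terminal statuses are valid in a result bundle.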

## Runtime auto-fallback

When `runtime="docker"` is set and the Docker daemon is not reachable at startup, `<Sandbox>` silently falls back to `"bubblewrap"`. The resolved runtime is recorded in the sandbox config and surfaced in the `SandboxCreated` event. No other runtime combination triggers automatic fallback.
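The fallback rule amounts to a small decision function. A minimal sketch, assuming a hypothetical `resolveRuntimeSketch` helper and a boolean that reports whether the Docker daemon was reachable:

```ts
type SandboxRuntime = "bubblewrap" | "docker" | "codeplane";

// Only a requested "docker" runtime with an unreachable daemon is rewritten.
function resolveRuntimeSketch(
  requested: SandboxRuntime | undefined,
  dockerAvailable: boolean,
): SandboxRuntime {
  const runtime = requested ?? "bubblewrap";
  if (runtime === "docker" && !dockerAvailable) {
    return "bubblewrap";
  }
  return runtime;
}
```

Note that `"codeplane"` is returned unchanged even when Docker is unavailable, since no other combination triggers fallback.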

## Concurrency limits

The maximum number of simultaneously active sandboxes within a single workflow run is controlled by the `SMITHERS_MAX_CONCURRENT_SANDBOXES` environment variable. It defaults to `10`. If the limit is reached when a new `<Sandbox>` node is mounted, the component throws immediately with `SANDBOX_EXECUTION_FAILED`.

```bash
SMITHERS_MAX_CONCURRENT_SANDBOXES=5 smithers up workflow.tsx
```
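A sketch of how such a limit could be read and enforced follows. The helper and class names are hypothetical, and falling back to the default for missing or invalid values is an assumption of this example, not documented behavior.

```ts
const DEFAULT_MAX_CONCURRENT_SANDBOXES = 10;

// Read the limit from an environment-like map (for example, process.env).
function readSandboxLimit(env: Record<string, string | undefined>): number {
  const raw = env["SMITHERS_MAX_CONCURRENT_SANDBOXES"];
  const parsed = raw === undefined ? NaN : parseInt(raw, 10);
  // Assumption: missing or invalid values fall back to the default.
  return isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_CONCURRENT_SANDBOXES;
}

// Counts active sandboxes and fails fast instead of queueing.
class SandboxSlotsSketch {
  private active = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  acquire(): void {
    if (this.active >= this.limit) {
      throw new Error("SANDBOX_EXECUTION_FAILED: concurrency limit reached");
    }
    this.active += 1;
  }

  release(): void {
    this.active = Math.max(0, this.active - 1);
  }
}
```

The fail-fast `acquire` mirrors the documented behavior of throwing immediately rather than waiting for a slot.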

## Streaming log capture

If the child workflow produces a `logs/stream.ndjson` file during execution, that file is included in the result bundle and its path is available as `logsPath` in the validated bundle. Log capture does not contribute to the bundle size estimate until the bundle is written.

## Custom command override

Use `command` to replace the default `smithers up bundle.tsx` entrypoint:

```tsx
<Sandbox
  id="custom-run"
  workflow={myWorkflow}
  output={outputs.result}
  runtime="docker"
  image="node:20-alpine"
  command="node dist/runner.js"
/>
```

## Passing input to the sandbox

The `input` prop is serialized into the request bundle manifest and passed directly to the child workflow as its `input`. Any JSON-serializable value is valid:

```tsx
<Sandbox
  id="analyze"
  workflow={analyzeWorkflow}
  input={{
    repo: ctx.input.repo,
    ref: ctx.input.sha,
    checks: ["lint", "types", "tests"],
  }}
  output={outputs.analysis}
  runtime="bubblewrap"
/>
```

## Security notes

`<Sandbox>` enforces several controls to prevent unsafe bundles from affecting the host filesystem.

**Path traversal protection.** Every patch file path in the bundle manifest is resolved relative to `patches/` and checked with `path.relative`. Any path that resolves outside the bundle root (for example, through `..` segments) causes an immediate `TOOL_PATH_ESCAPE` error, and the bundle is rejected before any files are applied.
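A containment check of this kind can be sketched with Node's `path` module. The function name is hypothetical, and the sketch assumes manifest entries are written relative to the bundle root, as in the manifest example (`"patches/change.patch"`).

```ts
import * as path from "node:path";

// POSIX semantics keep the sketch deterministic across platforms.
const posix = path.posix;

// Resolve a manifest entry and reject anything that lands outside the bundle root.
function resolvePatchPathSketch(bundleRoot: string, manifestEntry: string): string {
  const root = posix.resolve(bundleRoot);
  const candidate = posix.resolve(root, manifestEntry);
  const rel = posix.relative(root, candidate);
  const escapes =
    rel === "" || rel === ".." || rel.indexOf("../") === 0 || posix.isAbsolute(rel);
  if (escapes) {
    throw new Error(`TOOL_PATH_ESCAPE: ${manifestEntry}`);
  }
  return candidate;
}
```

Checking the relative path after resolution catches both plain `../` prefixes and entries such as `patches/../../outside.patch` that only escape once normalized.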

**Patch file limit.** Bundles with more than 1,000 `.patch` files are rejected. This prevents resource exhaustion from unbounded file enumeration during bundle validation.

**README.md size limit.** The `README.md` manifest is capped at 5 MB. Oversized manifests are rejected before their JSON is parsed, preventing memory exhaustion from malformed bundles.

**Network isolation.** `allowNetwork` defaults to `false`. Each runtime enforces this constraint at the environment level, not in application code.

**Docker image pinning.** Specify an exact digest or a pinned tag in `image` to prevent image drift between runs. Untagged images resolve to `latest`, which can change between pulls.

**Codeplane credentials.** The `codeplane` runtime requires `CODEPLANE_API_URL` and `CODEPLANE_API_KEY` environment variables. If either is missing, the sandbox fails at `create` time with `INVALID_INPUT` rather than at execution time.

## Rendering

`<Sandbox>` renders to a `<smithers:sandbox>` host element. The child workflow definition is passed as the internal `__smithersSandboxWorkflow` attribute and the input as `__smithersSandboxInput`. These internal attributes are consumed by the engine and are not visible in the workflow tree. When `skipIf` is `true` the component returns `null` and no sandbox is provisioned.

## Notes

- A sandbox that fails during execution records `status: "failed"` in the local database and emits a `SandboxFailed` event. The error is re-thrown to the parent workflow unless `continueOnFail={true}`.
- `cleanup` is always called in a `finally` block. Cleanup errors are silently swallowed to avoid masking the original failure.
- `reviewDiffs` defaults to `true`. Set `autoAcceptDiffs={true}` to bypass the approval gate in automated pipelines.
- The `workspace.persistence` field only affects the Codeplane runtime. `"ephemeral"` workspaces are destroyed after each run; `"sticky"` workspaces are retained and reused on the next run with the same `workspace.name`.
- `snapshotId` restores a named Codeplane snapshot before execution begins, enabling fast environment setup without a full install step on every run.
- Steps declared in `dependsOn` must complete successfully before the sandbox is provisioned. The sandbox does not count toward the concurrency limit until provisioning begins.

---

## <ContinueAsNew>

> End the current run and start a fresh run with carried state, preventing unbounded workflow history growth.
> Source: https://smithers.sh/components/continue-as-new

Every Temporal workflow run accumulates an event history. For very long-running workflows — think a daemon that processes events indefinitely or a poller that runs for months — that history grows without bound, increasing replay time and memory pressure. `<ContinueAsNew>` solves this by closing the current run cleanly and immediately starting a fresh one, optionally carrying state across the boundary.

The new run begins with a clean history. From the outside it looks like the same workflow is still running. Inside, the state you passed arrives as `ctx.input.__smithersContinuation.payload`.

## Import

```tsx
import { ContinueAsNew, continueAsNew } from "smithers-orchestrator";
```

`continueAsNew(state?)` is a convenience helper that returns a `<ContinueAsNew>` element. Both forms are identical in behavior.

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `state` | `unknown` | `undefined` | Optional JSON-serializable payload carried into the next run as `ctx.input.__smithersContinuation.payload`. |

## Basic usage

Unconditionally hand off to a fresh run:

```tsx
import { ContinueAsNew, Workflow, createSmithers } from "smithers-orchestrator";

const { smithers } = createSmithers({});

export default smithers(() => (
  <Workflow name="always-continues">
    <ContinueAsNew />
  </Workflow>
));
```

This closes the run immediately and starts a new one with no carried state.

## Carrying state into the next run

Pass a JSON-serializable object via the `state` prop. The next run receives it at `ctx.input.__smithersContinuation.payload`:

```tsx
import { ContinueAsNew, Sequence, Task, Workflow, createSmithers } from "smithers-orchestrator";
import { z } from "zod";

const { smithers, outputs } = createSmithers({
  processed: z.object({ count: z.number(), lastCursor: z.string().nullable() }),
});

export default smithers((ctx) => {
  const continuation = (ctx.input as any)?.__smithersContinuation as
    | { payload?: { cursor?: string | null; count?: number } }
    | undefined;

  const cursor = continuation?.payload?.cursor ?? null;
  const count = continuation?.payload?.count ?? 0;

  return (
    <Workflow name="paginated-processor">
      <Sequence>
        <Task id="process-batch" output={outputs.processed} agent={processorAgent}>
          {`Process the next page. Cursor: ${cursor ?? "start"}. Total so far: ${count}.`}
        </Task>
        <ContinueAsNew
          state={{
            // Fall back to null: the state payload does not support nested `undefined`.
            cursor:
              ctx.outputMaybe(outputs.processed, { nodeId: "process-batch" })?.lastCursor ?? null,
            count: count + 1,
          }}
        />
      </Sequence>
    </Workflow>
  );
});
```
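The `as any` cast in the example above can be wrapped in a small typed reader. This is a sketch: `readContinuationPayload` is a hypothetical helper, not a Smithers export, and it assumes only the documented `__smithersContinuation.payload` path.

```ts
interface ContinuationEnvelopeSketch<T> {
  payload?: T;
}

// Read the carried payload from a workflow input, or undefined on the first run.
function readContinuationPayload<T>(input: unknown): T | undefined {
  if (typeof input !== "object" || input === null) {
    return undefined;
  }
  const envelope = (input as { __smithersContinuation?: ContinuationEnvelopeSketch<T> })
    .__smithersContinuation;
  return envelope?.payload;
}

// Example: defaults apply on the first run, carried values on later runs.
type PagerState = { cursor?: string | null; count?: number };

function readPagerState(input: unknown): { cursor: string | null; count: number } {
  const payload = readContinuationPayload<PagerState>(input);
  return { cursor: payload?.cursor ?? null, count: payload?.count ?? 0 };
}
```

Inside a workflow builder this would be called as `readPagerState(ctx.input)`. The type parameter is not checked at runtime, so validate the payload if it may come from an older workflow version.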

## Conditional continuation

Most workflows continue as new only under certain conditions, for example after a fixed number of iterations or while more input remains:

```tsx
export default smithers((ctx) => {
  const continuation = (ctx.input as any)?.__smithersContinuation as
    | { payload?: { cursor?: string } }
    | undefined;

  const nextCursor = ctx.outputMaybe(outputs.batch, { nodeId: "fetch" })?.nextCursor;
  const isDone = nextCursor == null;

  return (
    <Workflow name="cursor-drain">
      <Sequence>
        <Task id="fetch" output={outputs.batch} agent={fetcherAgent}>
          {`Fetch from cursor: ${continuation?.payload?.cursor ?? "start"}`}
        </Task>
        {isDone ? null : continueAsNew({ cursor: nextCursor })}
        {isDone ? (
          <Task id="finalize" output={outputs.summary} agent={summarizerAgent}>
            All pages processed. Write the final summary.
          </Task>
        ) : null}
      </Sequence>
    </Workflow>
  );
});
```

## Combined with Loop

A `<Loop>` is the right tool when the number of iterations is bounded and known at design time. Use `<ContinueAsNew>` when the workflow genuinely needs to run indefinitely or when total iteration count is unknown:

```tsx
export default smithers((ctx) => {
  const continuation = (ctx.input as any)?.__smithersContinuation as
    | { payload?: { generation?: number } }
    | undefined;

  const generation = continuation?.payload?.generation ?? 0;
  const MAX_PER_RUN = 50;

  return (
    <Workflow name="generational-worker">
      <Loop
        until={ctx.iterationCount("work", "do-work") >= MAX_PER_RUN}
        maxIterations={MAX_PER_RUN}
        onMaxReached="return-last"
      >
        <Task id="do-work" output={outputs.result} agent={workerAgent}>
          {`Generation ${generation}, iteration ${ctx.iteration}. Do the work.`}
        </Task>
      </Loop>
      <ContinueAsNew state={{ generation: generation + 1 }} />
    </Workflow>
  );
});
```

Each run handles `MAX_PER_RUN` iterations via the loop, then hands off to a fresh run. The loop bounds the work done per run, and the handoff bounds the event history of any single run.

## Behavior

- When the scheduler encounters `<ContinueAsNew>`, it signals the current run to close with status `continued`.
- The engine emits a `RunContinuedAsNew` event and immediately starts a new run of the same workflow.
- If `state` is provided, it is serialized to JSON. Non-serializable payloads fail the run synchronously at render time.
- The new run receives the payload at `ctx.input.__smithersContinuation.payload`.
- Any tasks or nodes rendered after `<ContinueAsNew>` in the same sequence do not execute — the handoff happens immediately.
- The workflow id is preserved across continuations. Only the run id changes.

## Rendering

`<ContinueAsNew>` renders as a `smithers:continue-as-new` host element.

## Notes

- Use `<ContinueAsNew>` for workflows that run indefinitely (daemons, pollers, event processors). For bounded iteration, [`<Loop>`](/components/loop) is simpler.
- The `state` payload must be JSON-serializable. Classes, functions, `undefined` nested inside objects, and circular references are not supported.
- The serialized continuation state (including the `__smithersContinuation` envelope) must be under **10 MB**. Exceeding this limit fails the run.
- Access the carried payload via `ctx.input.__smithersContinuation.payload`. Type-cast as needed since the input type does not include this field by default.
- Temporal imposes a maximum event history size (default 50,000 events / 50 MB). `<ContinueAsNew>` before approaching this limit is the recommended mitigation.
- The `continueAsNew(state?)` helper is interchangeable with `<ContinueAsNew state={state} />` — choose whichever reads more clearly in context.
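
Because a non-serializable payload fails the run, it can help to check `state` before rendering. The sketch below is one possible pre-flight check that covers the cases listed above (functions, nested `undefined`, class instances, circular references). `findNonSerializable` is a hypothetical helper, and Smithers' own validation may apply different rules.

```ts
// Hypothetical pre-flight check; Smithers' own validation may differ.
// Returns a description of the first problem found, or null if none.
function findNonSerializable(
  value: unknown,
  path = "state",
  seen: Set<object> = new Set(),
): string | null {
  if (value === null) return null;
  const t = typeof value;
  if (t === "string" || t === "number" || t === "boolean") return null;
  // undefined, function, symbol, and bigint have no JSON representation.
  if (t !== "object") return `${path}: ${t}`;

  const obj = value as object;
  if (seen.has(obj)) return `${path}: circular reference`;
  const proto = Object.getPrototypeOf(obj);
  const isPlain = Array.isArray(obj) || proto === Object.prototype || proto === null;
  if (!isPlain) return `${path}: class instance`;

  seen.add(obj);
  for (const [key, child] of Object.entries(obj)) {
    const problem = findNonSerializable(child, `${path}.${key}`, seen);
    if (problem !== null) return problem;
  }
  seen.delete(obj); // only ancestors count as cycles; shared siblings are fine
  return null;
}

class Cursor {
  position = 0;
}

const okState = findNonSerializable({ generation: 1, seenIds: [1, 2] });
const fnState = findNonSerializable({ generation: 1, onDone: () => {} });
const nestedUndefined = findNonSerializable({ a: { b: undefined } });
const classState = findNonSerializable({ cursor: new Cursor() });

const cyclic: Record<string, unknown> = { name: "a" };
cyclic.self = cyclic;
const cyclicState = findNonSerializable(cyclic);
```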

---

## <Saga>

> Forward steps with registered compensations executed in reverse on failure or cancel.
> Source: https://smithers.sh/components/saga

```tsx
import { Saga } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | auto-generated | ID prefix for the [saga](https://microservices.io/patterns/data/saga.html). |
| `steps` | `SagaStepDef[]` | `undefined` | Array of step definitions with action and compensation elements. |
| `onFailure` | `"compensate" \| "compensate-and-fail" \| "fail"` | `"compensate"` | What to do when a step fails. |
| `skipIf` | `boolean` | `false` | Skip the entire saga. Returns `null`. |
| `children` | `React.ReactNode` | `undefined` | Alternative to `steps` — nest `<Saga.Step>` children. |

### SagaStepDef

| Field | Type | Description |
| --- | --- | --- |
| `id` | `string` | Unique step identifier. |
| `action` | `ReactElement` | The forward action to execute. |
| `compensation` | `ReactElement` | The rollback action on failure. |
| `label` | `string` | Optional display label. |

### Saga.Step Props

| Prop | Type | Description |
| --- | --- | --- |
| `id` | `string` | Step identifier. |
| `compensation` | `ReactElement` | Rollback element for this step. |
| `children` | `ReactElement` | The forward action element. |

## Basic usage with steps array

```tsx
<Workflow name="deploy-saga">
  <Saga
    id="deploy"
    steps={[
      {
        id: "create-pr",
        action: (
          <Task id="create-pr" output={outputs.pr} agent={codeAgent}>
            Create a pull request with the changes.
          </Task>
        ),
        compensation: (
          <Task id="close-pr" output={outputs.closePr} agent={codeAgent}>
            Close the pull request and clean up.
          </Task>
        ),
      },
      {
        id: "deploy-staging",
        action: (
          <Task id="deploy-staging" output={outputs.staging} agent={deployAgent}>
            Deploy to staging environment.
          </Task>
        ),
        compensation: (
          <Task id="rollback-staging" output={outputs.rollbackStaging} agent={deployAgent}>
            Rollback staging deployment.
          </Task>
        ),
      },
      {
        id: "deploy-prod",
        action: (
          <Task id="deploy-prod" output={outputs.prod} agent={deployAgent}>
            Deploy to production.
          </Task>
        ),
        compensation: (
          <Task id="rollback-prod" output={outputs.rollbackProd} agent={deployAgent}>
            Rollback production deployment.
          </Task>
        ),
      },
    ]}
  />
</Workflow>
```

## Declarative children syntax

Use `<Saga.Step>` children for a JSX-native API:

```tsx
<Workflow name="ticket-saga">
  <Saga id="ticket-flow">
    <Saga.Step
      id="create-ticket"
      compensation={
        <Task id="delete-ticket" output={outputs.deleteTicket} agent={ticketAgent}>
          Delete the created ticket.
        </Task>
      }
    >
      <Task id="create-ticket" output={outputs.ticket} agent={ticketAgent}>
        Create a Linear ticket for the feature.
      </Task>
    </Saga.Step>

    <Saga.Step
      id="create-branch"
      compensation={
        <Task id="delete-branch" output={outputs.deleteBranch} agent={gitAgent}>
          Delete the feature branch.
        </Task>
      }
    >
      <Task id="create-branch" output={outputs.branch} agent={gitAgent}>
        Create a feature branch from main.
      </Task>
    </Saga.Step>
  </Saga>
</Workflow>
```

## Failure modes

The `onFailure` prop controls what happens when a step fails:

- **`"compensate"`** (default) — Run compensation for all completed steps in reverse order. The saga resolves after compensation completes.
- **`"compensate-and-fail"`** — Run compensation, then propagate the failure to the parent [`<Workflow>`](/components/workflow).
- **`"fail"`** — Skip compensation entirely and fail immediately.

```tsx
<Saga id="strict-deploy" onFailure="compensate-and-fail" steps={steps} />
```

## How compensation works

When a step fails, the engine:

1. Enters compensation mode.
2. Identifies all completed steps (steps before the failed one).
3. Executes their compensation elements in reverse order.
4. Each compensation task receives the failed step's error context in its metadata.

For example, if step 3 of 5 fails, compensations for steps 2 and 1 run (in that order).
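
The ordering can be illustrated with a small simulation. This is a simplified model of the rules described above, not the Smithers engine: `runSagaModel`, `StepModel`, and the `status` values are hypothetical, and steps are plain synchronous functions rather than tasks.

```ts
// Simplified model of saga execution; not the Smithers engine.
type OnFailure = "compensate" | "compensate-and-fail" | "fail";

interface StepModel {
  id: string;
  action: () => void; // throws to signal failure
  compensation: () => void; // rollback for this step
}

interface SagaResult {
  status: "completed" | "compensated" | "failed";
  compensated: string[]; // step ids, in the order compensations ran
}

function runSagaModel(steps: StepModel[], onFailure: OnFailure = "compensate"): SagaResult {
  const completed: StepModel[] = [];
  for (const step of steps) {
    try {
      step.action();
      completed.push(step);
    } catch {
      // "fail" skips compensation entirely.
      if (onFailure === "fail") return { status: "failed", compensated: [] };
      const compensated: string[] = [];
      // Roll back completed steps, most recent first.
      for (const done of [...completed].reverse()) {
        done.compensation();
        compensated.push(done.id);
      }
      return {
        status: onFailure === "compensate-and-fail" ? "failed" : "compensated",
        compensated,
      };
    }
  }
  return { status: "completed", compensated: [] };
}

// Five steps where step 3 fails.
const sagaLog: string[] = [];
const sagaSteps: StepModel[] = [1, 2, 3, 4, 5].map((n) => ({
  id: `step-${n}`,
  action: () => {
    if (n === 3) throw new Error("step 3 failed");
    sagaLog.push(`run ${n}`);
  },
  compensation: () => {
    sagaLog.push(`undo ${n}`);
  },
}));

const sagaResult = runSagaModel(sagaSteps);
```

In this model, steps 4 and 5 never run, and the failed step 3 is not compensated because it never completed.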

## Conditional skipping

```tsx
<Saga skipIf={ctx.input.dryRun} steps={steps} />
```

## Rendering

`<Saga>` renders to a `<smithers:saga>` host element. Action steps are mounted as sequential children. Compensation elements are stored as engine metadata and only mounted when a failure triggers rollback.

## Notes

- Steps execute sequentially — each step waits for the previous one to complete.
- Compensation elements should be idempotent when possible.
- The `steps` prop and `Saga.Step` children are mutually exclusive. If both are provided, `steps` takes priority.

---

## <TryCatchFinally>

> Workflow-scoped error boundaries with catch handlers and guaranteed cleanup.
> Source: https://smithers.sh/components/try-catch-finally

```tsx
import { TryCatchFinally } from "smithers-orchestrator";
```

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | auto-generated | ID prefix for the error boundary. |
| `try` | `ReactElement` | **(required)** | The main workflow content. |
| `catch` | `ReactElement \| ((error: SmithersError) => ReactElement)` | `undefined` | Recovery handler mounted on failure. |
| `catchErrors` | `SmithersErrorCode[]` | all errors | Restrict which error codes trigger the catch handler. |
| `finally` | `ReactElement` | `undefined` | Always runs after try (success) or catch (failure). |
| `skipIf` | `boolean` | `false` | Skip the entire block. Returns `null`. |

## Basic usage

```tsx
<Workflow name="safe-deploy">
  <TryCatchFinally
    try={
      <Sequence>
        <Task id="build" output={outputs.build} agent={buildAgent}>
          Build the project.
        </Task>
        <Task id="deploy" output={outputs.deploy} agent={deployAgent}>
          Deploy to production.
        </Task>
      </Sequence>
    }
    catch={
      <Task id="notify-failure" output={outputs.notify} agent={notifyAgent}>
        Send a failure notification to the team.
      </Task>
    }
    finally={
      <Task id="cleanup" output={outputs.cleanup}>
        {{ cleanedUp: true }}
      </Task>
    }
  />
</Workflow>
```

## Dynamic catch handler

When `catch` is a function, it receives the `SmithersError` and returns a React element. This allows error-specific recovery:

```tsx
<TryCatchFinally
  try={
    <Task id="risky-op" output={outputs.riskyOp} agent={agent}>
      Perform a risky operation.
    </Task>
  }
  catch={(error) => (
    <Task id="recover" output={outputs.recover} agent={recoveryAgent}>
      {`Recover from error: ${error.code} — ${error.summary}`}
    </Task>
  )}
/>
```

## Filtering by error code

Use `catchErrors` to only catch specific error types. Unmatched errors propagate normally:

```tsx
<TryCatchFinally
  catchErrors={["TASK_TIMEOUT", "AGENT_CLI_ERROR"]}
  try={
    <Task id="flaky-task" output={outputs.flaky} agent={agent} timeoutMs={30000}>
      Run a task that might time out or have agent issues.
    </Task>
  }
  catch={
    <Task id="fallback" output={outputs.fallback}>
      {{ usedFallback: true }}
    </Task>
  }
/>
```

## Finally-only (guaranteed cleanup)

Omit `catch` to let errors propagate while still running cleanup:

```tsx
<TryCatchFinally
  try={
    <Task id="create-worktree" output={outputs.worktree} agent={gitAgent}>
      Create a temporary worktree.
    </Task>
  }
  finally={
    <Task id="cleanup-worktree" output={outputs.cleanup} agent={gitAgent}>
      Remove the temporary worktree.
    </Task>
  }
/>
```

## Execution flow

1. The `try` block runs first.
2. If any task in `try` fails:
   - If `catchErrors` is set, the error code is checked. Non-matching errors skip the catch handler.
   - If the error matches (or `catchErrors` is not set), the `catch` handler mounts.
   - If `catch` is a function, it receives the `SmithersError` and the returned element mounts.
3. The `finally` block always runs: after `try` succeeds, after `catch` completes or fails, and when an error is not caught at all.
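
The flow above can be sketched as a small model. This is an illustration only: `runBlock`, `BlockModel`, and `ModelError` are hypothetical names, and the `try` function returns an error value instead of failing a task the way the engine would.

```ts
// Simplified model of the execution flow; not the Smithers engine.
interface ModelError {
  code: string;
  summary: string;
}

interface BlockModel {
  try: () => ModelError | null; // returns the error if the try block failed
  catch?: (error: ModelError) => void;
  catchErrors?: string[]; // undefined means "catch everything"
  finally?: () => void;
}

interface BlockResult {
  order: string[]; // which blocks ran, in order
  propagated: ModelError | null; // error left for an outer block, if any
}

function runBlock(block: BlockModel): BlockResult {
  const order: string[] = ["try"];
  let propagated: ModelError | null = block.try();

  if (propagated !== null) {
    const matches =
      block.catchErrors === undefined || block.catchErrors.includes(propagated.code);
    if (block.catch !== undefined && matches) {
      order.push("catch");
      block.catch(propagated);
      propagated = null; // handled here, nothing reaches the outer block
    }
  }

  // Cleanup runs whether or not the error was handled.
  if (block.finally !== undefined) {
    order.push("finally");
    block.finally();
  }
  return { order, propagated };
}

const handled = runBlock({
  catchErrors: ["TASK_TIMEOUT"],
  try: () => ({ code: "TASK_TIMEOUT", summary: "timed out after 30s" }),
  catch: () => {},
  finally: () => {},
});

const unmatched = runBlock({
  catchErrors: ["TASK_TIMEOUT"],
  try: () => ({ code: "AGENT_CLI_ERROR", summary: "agent exited" }),
  catch: () => {},
  finally: () => {},
});
```

In the second call the error code does not match `catchErrors`, so the catch handler is skipped, cleanup still runs, and the error remains for an outer block to handle.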

## Conditional skipping

```tsx
<TryCatchFinally
  skipIf={!ctx.input.enableErrorHandling}
  try={mainWorkflow}
  catch={recoveryWorkflow}
/>
```

## Nesting

`<TryCatchFinally>` blocks can be nested. Inner blocks handle errors first; unhandled errors propagate to outer blocks:

```tsx
<TryCatchFinally
  try={
    <Sequence>
      <TryCatchFinally
        catchErrors={["TASK_TIMEOUT"]}
        try={
          <Task id="inner-task" output={outputs.inner} agent={agent} timeoutMs={10000}>
            Task that might time out.
          </Task>
        }
        catch={
          <Task id="retry-with-longer-timeout" output={outputs.retry} agent={agent} timeoutMs={60000}>
            Retry with a longer timeout.
          </Task>
        }
      />
      <Task id="next-step" output={outputs.next}>
        {{ continued: true }}
      </Task>
    </Sequence>
  }
  catch={
    <Task id="outer-catch" output={outputs.outerCatch} agent={notifyAgent}>
      Handle any error that the inner block did not catch.
    </Task>
  }
/>
```

## Rendering

`<TryCatchFinally>` renders to a `<smithers:try-catch-finally>` host element. The `try` block is always mounted as children. The `catch` and `finally` blocks are stored as engine metadata and mounted on demand.

## Notes

- The `try` prop accepts a single `ReactElement`. Wrap multiple elements in `<Sequence>` or `<Parallel>`.
- The `catch` handler receives the first matching error. Subsequent errors in the try block are ignored once catch is active.
- The `finally` block runs regardless of whether an error occurred, whether it was caught, or whether the catch handler itself fails.
- `catchErrors` accepts any `SmithersErrorCode` string. See the [error reference](/reference/errors) for the full list.

---

## <Aspects>

> Declarative cross-cutting concerns — token budgets, latency SLOs, and cost tracking — applied at workflow scope without modifying individual tasks.
> Source: https://smithers.sh/components/aspects

```tsx
import { Aspects } from "smithers-orchestrator";
```

Aspects wraps a section of the workflow tree and propagates budget constraints and tracking configuration to all descendant `<Task>` components. Individual tasks do not need to be modified — the constraints flow through React context.

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `tokenBudget` | `TokenBudgetConfig` | `undefined` | Token budget for all tasks in scope. |
| `latencySlo` | `LatencySloConfig` | `undefined` | Latency SLO for all tasks in scope. |
| `costBudget` | `CostBudgetConfig` | `undefined` | Cost budget in USD for all tasks in scope. |
| `tracking` | `TrackingConfig` | `{ tokens: true, latency: true, cost: true }` | Which metrics to track. |
| `children` | `ReactNode` | — | Workflow content these aspects apply to. |

### TokenBudgetConfig

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `max` | `number` | **(required)** | Maximum total tokens across all tasks in scope. |
| `perTask` | `number` | `undefined` | Optional per-task token limit. |
| `onExceeded` | `"fail" \| "warn" \| "skip-remaining"` | `"fail"` | Behavior when the budget is exceeded. |

### LatencySloConfig

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `maxMs` | `number` | **(required)** | Maximum total latency in ms across all tasks. |
| `perTask` | `number` | `undefined` | Optional per-task latency limit in ms. |
| `onExceeded` | `"fail" \| "warn"` | `"fail"` | Behavior when the SLO is exceeded. |

### CostBudgetConfig

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `maxUsd` | `number` | **(required)** | Maximum total cost in USD across all tasks. |
| `onExceeded` | `"fail" \| "warn" \| "skip-remaining"` | `"fail"` | Behavior when the budget is exceeded. |

### TrackingConfig

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `tokens` | `boolean` | `true` | Track token usage. |
| `latency` | `boolean` | `true` | Track latency. |
| `cost` | `boolean` | `true` | Track cost. |

## Basic usage

Wrap any section of your workflow with `<Aspects>` to apply budgets:

```tsx
import { createSmithers, Aspects } from "smithers-orchestrator";
import { z } from "zod";

// codeAgent and reviewAgent are agents defined elsewhere in your project.
const { Workflow, Task, smithers, outputs } = createSmithers({
  analysis: z.object({ summary: z.string() }),
  review: z.object({ verdict: z.string() }),
});

export default smithers((ctx) => (
  <Workflow name="budgeted-workflow">
    <Aspects
      tokenBudget={{ max: 100_000, perTask: 25_000, onExceeded: "warn" }}
      latencySlo={{ maxMs: 30_000, onExceeded: "fail" }}
    >
      <Task id="analyze" output={outputs.analysis} agent={codeAgent}>
        Analyze the repository.
      </Task>
      <Task id="review" output={outputs.review} agent={reviewAgent}>
        Review the analysis.
      </Task>
    </Aspects>
  </Workflow>
));
```

Both tasks inherit the token budget and latency SLO without any per-task configuration.

## Cost tracking

Track and limit spending across a workflow scope:

```tsx
<Aspects
  costBudget={{ maxUsd: 5.00, onExceeded: "skip-remaining" }}
  tracking={{ cost: true, tokens: true, latency: false }}
>
  <Task id="expensive-analysis" output={outputs.analysis} agent={gpt4Agent}>
    Perform deep analysis.
  </Task>
  <Task id="summary" output={outputs.summary} agent={fastAgent}>
    Summarize the results.
  </Task>
</Aspects>
```

When the cost budget is exceeded with `onExceeded: "skip-remaining"`, subsequent tasks in the scope are skipped.

## Nesting

Aspects can be nested. Inner scopes inherit from outer scopes, with inner values taking precedence:

```tsx
<Aspects tokenBudget={{ max: 200_000 }}>
  <Task id="step1" output={outputs.step1} agent={agent}>
    Step 1
  </Task>

  <Aspects tokenBudget={{ max: 50_000, perTask: 10_000 }}>
    <Task id="step2" output={outputs.step2} agent={agent}>
      Step 2 — tighter budget
    </Task>
  </Aspects>
</Aspects>
```

The inner `<Aspects>` overrides `tokenBudget` but inherits any `latencySlo` or `costBudget` from the outer scope.
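
One way to picture the inheritance rule is a shallow merge where each inner key replaces the matching outer key as a whole. This is a sketch under that assumption: `AspectScope` and `resolveScope` are hypothetical names, and the real implementation may merge at a finer granularity.

```ts
// Hypothetical model of nested scopes; assumes whole-config replacement per key.
interface AspectScope {
  tokenBudget?: { max: number; perTask?: number };
  latencySlo?: { maxMs: number; perTask?: number };
  costBudget?: { maxUsd: number };
}

// Inner values win; anything the inner scope omits is inherited.
function resolveScope(outer: AspectScope, inner: AspectScope): AspectScope {
  return { ...outer, ...inner };
}

const outerScope: AspectScope = {
  tokenBudget: { max: 200_000 },
  latencySlo: { maxMs: 30_000 },
};
const innerScope: AspectScope = {
  tokenBudget: { max: 50_000, perTask: 10_000 },
};

// tokenBudget comes from the inner scope, latencySlo from the outer one.
const effectiveScope = resolveScope(outerScope, innerScope);
```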

## Exceeded behavior

| `onExceeded` value | Behavior |
| --- | --- |
| `"fail"` | Task fails with `SmithersError` code `ASPECT_BUDGET_EXCEEDED`. Follows normal retry/`continueOnFail` behavior. |
| `"warn"` | Task completes normally. A warning event is emitted. |
| `"skip-remaining"` | Current task completes. Subsequent tasks in the Aspects scope are skipped. |

## How it works

`<Aspects>` is a React context provider. It creates an `AspectContext` that descendant `<Task>` components read during rendering. When the engine executes a task, it checks the attached aspect metadata for budget limits and tracking configuration.

The accumulator tracks running totals (tokens, latency, cost) across all tasks in the scope. Each task execution updates the accumulator, and subsequent tasks check against the configured limits.
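
The accumulator and the three `onExceeded` modes can be modeled in a few lines. This is a simplified sketch, not the engine: `runScope` and `TaskOutcome` are hypothetical names, only token usage is tracked, and the model assumes the limit is checked right after each task's usage is added to the total.

```ts
// Simplified model of a token-budget accumulator; not the Smithers engine.
type OnExceeded = "fail" | "warn" | "skip-remaining";

interface TokenBudget {
  max: number;
  onExceeded?: OnExceeded; // defaults to "fail"
}

type TaskOutcome = "completed" | "completed-with-warning" | "failed" | "skipped";

function runScope(
  tasks: { id: string; tokens: number }[],
  budget: TokenBudget,
): Record<string, TaskOutcome> {
  const onExceeded = budget.onExceeded ?? "fail";
  const outcomes: Record<string, TaskOutcome> = {};
  let total = 0; // running total across all tasks in the scope
  let skipping = false;

  for (const task of tasks) {
    if (skipping) {
      outcomes[task.id] = "skipped";
      continue;
    }
    total += task.tokens;
    if (total <= budget.max) {
      outcomes[task.id] = "completed";
    } else if (onExceeded === "fail") {
      outcomes[task.id] = "failed";
    } else if (onExceeded === "warn") {
      outcomes[task.id] = "completed-with-warning";
    } else {
      // skip-remaining: the current task completes, later tasks are skipped.
      outcomes[task.id] = "completed";
      skipping = true;
    }
  }
  return outcomes;
}

const scopeTasks = [
  { id: "analyze", tokens: 60_000 },
  { id: "review", tokens: 60_000 },
  { id: "summary", tokens: 10_000 },
];

const skipOutcomes = runScope(scopeTasks, { max: 100_000, onExceeded: "skip-remaining" });
const warnOutcomes = runScope(scopeTasks, { max: 100_000, onExceeded: "warn" });
const failOutcomes = runScope(scopeTasks, { max: 100_000 });
```

With a 100,000-token budget, `review` is the task that crosses the limit, so its outcome is the one that differs between the three modes.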

## Notes

- Aspects only apply to tasks mounted as descendants in the React tree. Tasks outside the `<Aspects>` wrapper are unaffected.
- Budget tracking is per-run. Resuming a workflow resets the accumulator.
- The `tracking` prop controls which metrics are collected, not which budgets are enforced. You can set a `tokenBudget` without enabling token tracking (the budget still applies; the metric just is not recorded for observability).

---

## <SuperSmithers>

> Workflow wrapper that reads and modifies source code to intervene via hot reload, driven by a markdown strategy document.
> Source: https://smithers.sh/components/super-smithers

```tsx
import { SuperSmithers } from "smithers-orchestrator";
```

SuperSmithers is a composite component that orchestrates source-code intervention. Given a strategy document and an agent, it reads target files, proposes modifications, optionally applies them (triggering hot reload), and generates a report.

This component is unique because it modifies source files at runtime, leveraging the [hot reload](/guides/hot-reload) system to propagate changes back into the running workflow.

## Props

| Prop | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `string` | `"super-smithers"` | ID prefix for all generated internal task IDs. |
| `strategy` | `string \| ReactElement` | **(required)** | Markdown string or MDX component describing the intervention strategy. |
| `agent` | `AgentLike` | **(required)** | Agent that reads code and decides modifications. |
| `targetFiles` | `string[]` | `undefined` | Glob patterns of files the agent can modify. |
| `reportOutput` | `OutputTarget` | `undefined` | Output schema for the intervention report. |
| `dryRun` | `boolean` | `false` | If true, reports changes without applying them. |
| `skipIf` | `boolean` | `false` | Standard skip predicate. |

## Basic usage

```tsx
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { createSmithers, SuperSmithers } from "smithers-orchestrator";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  report: z.object({
    filesChanged: z.array(z.string()),
    summary: z.string(),
  }),
});

const codeAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  instructions: "You are a senior software engineer performing code interventions.",
});

export default smithers((ctx) => (
  <Workflow name="intervention">
    <SuperSmithers
      id="refactor"
      strategy={`
        ## Refactoring Strategy

        1. Find all deprecated API calls in the target files
        2. Replace them with the new API equivalents
        3. Ensure all imports are updated
      `}
      agent={codeAgent}
      targetFiles={["src/**/*.ts"]}
      reportOutput={outputs.report}
    />
  </Workflow>
));
```

## MDX strategy

The strategy can be an MDX component for richer, parameterized documents:

```tsx
import RefactorStrategy from "./strategies/refactor.mdx";

<SuperSmithers
  id="refactor"
  strategy={<RefactorStrategy apiVersion="v2" targetModule="auth" />}
  agent={codeAgent}
  targetFiles={["src/auth/**/*.ts"]}
  reportOutput={outputs.report}
/>
```

## Dry run

Use `dryRun` to preview changes without applying them:

```tsx
<SuperSmithers
  id="preview"
  strategy={strategyDoc}
  agent={codeAgent}
  targetFiles={["src/**/*.ts"]}
  reportOutput={outputs.report}
  dryRun
/>
```

In dry-run mode, the apply step is skipped entirely. The agent still reads files and proposes modifications, but nothing is written to disk. The report describes what *would* change.

## Internal task structure

SuperSmithers expands to a sequence of four tasks (three in dry-run mode):

1. **`{id}-read`** — Agent reads the strategy document and target files, analyzes the codebase.
2. **`{id}-propose`** — Agent proposes specific code modifications based on the analysis.
3. **`{id}-apply`** *(skipped in dry-run)* — Compute task writes modifications to disk, triggering hot reload.
4. **`{id}-report`** — Agent generates a summary report of the intervention.

Each task depends on the previous one via `dependsOn`.
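
The naming scheme makes the generated IDs predictable, which is useful when inspecting a run. The sketch below derives them from `id` and `dryRun`. `internalTaskIds` and `dependencyChain` are hypothetical helpers, and the assumption that `{id}-report` depends directly on `{id}-propose` in dry-run mode is inferred from the list above rather than confirmed.

```ts
// Hypothetical helpers that mirror the documented naming scheme.
function internalTaskIds(id = "super-smithers", dryRun = false): string[] {
  const stages = ["read", "propose", "apply", "report"];
  return stages
    .filter((stage) => !(dryRun && stage === "apply")) // apply is skipped in dry-run
    .map((stage) => `${id}-${stage}`);
}

// Each task depends on the one before it; the first has no dependencies.
function dependencyChain(ids: string[]): Record<string, string[]> {
  const chain: Record<string, string[]> = {};
  let previous: string | undefined;
  for (const taskId of ids) {
    chain[taskId] = previous === undefined ? [] : [previous];
    previous = taskId;
  }
  return chain;
}

const fullIds = internalTaskIds("refactor");
const previewIds = internalTaskIds("preview", true);
const previewDeps = dependencyChain(previewIds);
```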

## Hot reload integration

When `dryRun` is false, the apply step writes modified files to disk. If the workflow is running in [hot mode](/guides/hot-reload), the file watcher detects the changes and triggers a hot reload cycle:

1. SuperSmithers apply task writes files
2. Hot reload watcher detects changes
3. Workflow module is re-imported from the overlay
4. Engine swaps the build function and re-renders

This creates a feedback loop where the agent can modify its own workflow definition at runtime.

## Notes

- SuperSmithers only operates meaningfully in hot-reload mode. Without hot reload, the apply step writes files but the running workflow does not pick up the changes until the next run.
- The `targetFiles` patterns are informational — they are included in the agent prompt to scope its analysis. File system access is governed by the agent's tool configuration.
- Each internal task uses the same `agent`. For different agents at each stage, compose the tasks manually using `<Task>` and `<Sequence>`.
- The `reportOutput` schema is used for the final report task. If not provided, internal string keys are used.

---

## Built-in Tools

> Sandboxed file and shell tools for AI agent tasks, with exact input schemas, security policies, and usage examples.
> Source: https://smithers.sh/integrations/tools

```ts
import { tools, read, write, edit, grep, bash, defineTool } from "smithers-orchestrator";
```

`tools` bundles all five tools keyed by name:

```ts
const { read, write, edit, grep, bash } = tools;
```

## Sandboxing

All tools are sandboxed to `rootDir` (defaults to the workflow directory). Paths are resolved relative to this root; escapes via symlinks are rejected.

| Policy | Behavior |
|---|---|
| Path resolution | Relative paths resolve against `rootDir`. Absolute paths must fall within root. |
| Symlinks | Rejected if target is outside sandbox. |
| Output size | Truncated to `maxOutputBytes` (default 200KB). |
| Timeouts | `bash` and `grep` default to 60s; exceeded processes killed with `SIGKILL`. |
| Network | `bash` blocks network commands by default. See [bash](#bash). |
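
The path-resolution policy can be illustrated with a small check. This is a POSIX-only sketch of the rule in the table, not the Smithers implementation: `normalizePath` and `resolveInSandbox` are hypothetical helpers, the error message is made up, and symlink targets are not inspected.

```ts
// POSIX-only sketch of the path policy; does not follow symlinks.
function normalizePath(path: string): string {
  const parts: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  }
  return "/" + parts.join("/");
}

function resolveInSandbox(rootDir: string, requested: string): string {
  const root = normalizePath(rootDir);
  // Relative paths resolve against rootDir; absolute paths are taken as given.
  const target = requested.startsWith("/")
    ? normalizePath(requested)
    : normalizePath(`${root}/${requested}`);
  // Compare with a trailing slash so "/home/project-other" is not treated as inside.
  const prefix = root === "/" ? "/" : `${root}/`;
  if (target !== root && !target.startsWith(prefix)) {
    throw new Error(`Path escapes sandbox: ${requested}`);
  }
  return target;
}

const insidePath = resolveInSandbox("/home/project", "src/auth.ts");
const collapsedPath = resolveInSandbox("/home/project", "/home/project/src/../README.md");

let escapeRejected = false;
try {
  resolveInSandbox("/home/project", "../secrets.txt");
} catch {
  escapeRejected = true;
}
```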

## Tool Call Logging

Every invocation is logged to `_smithers_tool_calls`:

| Field | Description |
|---|---|
| `runId` | Workflow run ID |
| `nodeId` | Task node that invoked the tool |
| `iteration` | Loop iteration |
| `attempt` | Retry attempt number |
| `seq` | Sequential call counter within the task |
| `toolName` | `read`, `write`, `edit`, `grep`, or `bash` |
| `inputJson` | Serialized input arguments |
| `outputJson` | Serialized output (truncated if over limit) |
| `startedAtMs` | Start timestamp |
| `finishedAtMs` | End timestamp |
| `status` | `"success"` or `"error"` |
| `errorJson` | Error details (if `"error"`) |

## defineTool

Use `defineTool()` to wrap custom [AI SDK](https://ai-sdk.dev) tools with Smithers runtime context, deterministic idempotency keys, and durable tool-call logging.

```ts
import { defineTool } from "smithers-orchestrator";
import { z } from "zod";

const placeOrder = defineTool({
  name: "wholefoods.place_order",
  description: "Place a grocery order",
  schema: z.object({
    sku: z.string(),
  }),
  sideEffect: true,
  idempotent: false,
  async execute(args, ctx) {
    return await wholeFoods.placeOrder({
      sku: args.sku,
      idempotencyKey: ctx.idempotencyKey,
    });
  },
});
```

- `ctx.idempotencyKey` is stable across retries and resumes for the same task iteration.
- `sideEffect: true` opts the tool into Smithers side-effect tracking.
- `idempotent: false` tells Smithers to warn resumed/retried agent loops when the tool was already called in a previous attempt.
- Smithers logs start/finish records for every `defineTool()` call in `_smithers_tool_calls`.

### Side Effects and Idempotency

Every custom tool that modifies external state **must** declare `sideEffect: true`. This is how Smithers knows to protect your [workflow](/concepts/workflows-overview) during retries and resumes. Without it, Smithers treats the tool as a pure read and will replay it freely — potentially sending duplicate emails, double-charging payments, or creating duplicate records.

The two flags work together:

| `sideEffect` | `idempotent` | Smithers behavior |
|---|---|---|
| `false` (default) | `true` (default) | Pure read. Safe to replay on retry. No warnings. |
| `true` | `true` | Mutates external state, but calling it twice with the same input produces the same result (e.g. an upsert, a PUT request). Safe to replay. No warnings. |
| `true` | `false` | Mutates external state and is **not** safe to replay (e.g. sending an email, placing an order, charging a payment). On retry, Smithers injects a warning telling the agent the tool was already called and it should verify external state before calling it again. |

When `sideEffect: true` and `idempotent: false`, Smithers does two things on retry:

1. **Warns the agent.** The retry prompt includes a message listing which non-idempotent tools were already called, so the agent can check external state before repeating them.
2. **Provides a stable idempotency key.** `ctx.idempotencyKey` is deterministic for a given task + iteration, so you can pass it to external APIs that support idempotency (Stripe, AWS, etc.) to deduplicate on their end.

If your `execute` function has `sideEffect: true, idempotent: false` but does not accept the `ctx` parameter, Smithers logs a warning at startup. This is almost always a bug — you need `ctx.idempotencyKey` to safely handle retries.

```ts
// ✗ Bad: non-idempotent side effect without ctx
const sendEmailUnsafe = defineTool({
  name: "email.send",
  schema: z.object({ to: z.string(), body: z.string() }),
  sideEffect: true,
  idempotent: false,
  async execute(args) {  // ← missing ctx parameter, Smithers warns
    await mailer.send(args);
  },
});

// ✓ Good: uses ctx.idempotencyKey to deduplicate
const sendEmail = defineTool({
  name: "email.send",
  schema: z.object({ to: z.string(), body: z.string() }),
  sideEffect: true,
  idempotent: false,
  async execute(args, ctx) {
    await mailer.send({ ...args, idempotencyKey: ctx.idempotencyKey });
  },
});
```
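
To see why the key matters, here is a minimal simulation of the receiving side. `FakeMailer` is a hypothetical stand-in for an external API that deduplicates on an idempotency key, and the key value shown is illustrative only, since the real key format is not specified here.

```ts
// Hypothetical stand-in for an external mail API that deduplicates on a key.
class FakeMailer {
  private seen = new Map<string, number>(); // idempotency key -> message id
  sent: { to: string; body: string }[] = [];

  send(msg: { to: string; body: string; idempotencyKey: string }): number {
    const existing = this.seen.get(msg.idempotencyKey);
    if (existing !== undefined) return existing; // replay: return the original result
    this.sent.push({ to: msg.to, body: msg.body });
    const id = this.sent.length;
    this.seen.set(msg.idempotencyKey, id);
    return id;
  }
}

const fakeMailer = new FakeMailer();
const stableKey = "example-key-for-task-iteration"; // illustrative value only

// The first attempt sends the message.
const firstAttempt = fakeMailer.send({
  to: "team@example.com",
  body: "Deploy finished",
  idempotencyKey: stableKey,
});

// A retry of the same task iteration carries the same key, so nothing is re-sent.
const retryAttempt = fakeMailer.send({
  to: "team@example.com",
  body: "Deploy finished",
  idempotencyKey: stableKey,
});
```

Deduplication only works if the external service honors the key. For services that do not, the retry warning is the remaining safeguard.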

### What counts as a side effect

A side effect is any mutation of state **outside the [sandbox](/components/sandbox)**. If the tool talks to an external API, writes to a database, sends a message, or triggers a webhook, it has a side effect. Mark it.

File system changes inside the sandbox — writing files, editing code, running `git commit` — are **not** side effects in this sense. The built-in `write`, `edit`, and `bash` tools modify the working directory, but those changes are local, sandboxed, and tracked by git. They are inherently reversible (`git checkout`, `git reset`) and inspectable (`git diff`, `git log`). Smithers does not need retry warnings or idempotency keys for them.

| Tool | Side effect? | Why |
|---|---|---|
| Built-in `read`, `grep` | No | Pure reads |
| Built-in `write`, `edit` | No | Sandboxed file changes, tracked by git |
| Built-in `bash` (local commands) | No | Local execution within sandbox |
| Custom tool calling an external API | **Yes** | Mutates state outside the sandbox |
| Custom tool writing to a database | **Yes** | External persistent state |
| Custom tool sending a Slack message | **Yes** | Irreversible external communication |
| Custom tool creating a GitHub PR | **Yes** | External state visible to others |

The rule is simple: **if you cannot undo it with `git reset`, mark it as a side effect.**

---

## read

Read a file from the sandbox.

```ts
{ path: string }  // relative to rootDir or absolute
```

Returns file contents as UTF-8. Throws `"File too large"` if size exceeds `maxOutputBytes`.

```ts
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { read, grep } from "smithers-orchestrator";

const codeAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  tools: { read, grep },
});
```

```tsx
{/* outputs comes from createSmithers() */}
<Task id="review" output={outputs.review} agent={codeAgent}>
  Read the file src/auth.ts and identify any security vulnerabilities.
</Task>
```

---

## write

Write content to a file. Creates parent directories as needed.

```ts
{
  path: string      // relative to rootDir or absolute
  content: string
}
```

Returns `"ok"`. Throws `"Content too large"` if content exceeds `maxOutputBytes`. Logs content hash (SHA-256) and byte size; full content is not stored.

```ts
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { write, read } from "smithers-orchestrator";

const writerAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  tools: { write, read },
});
```

---

## edit

Apply a unified diff patch to an existing file.

```ts
{
  path: string    // file to patch
  patch: string   // unified diff format
}
```

Returns `"ok"`. The file must exist. Reads current contents, applies the patch via `applyPatch`, writes back. Throws on size limits (`"Patch too large"`, `"File too large"`) or mismatched context (`"Failed to apply patch"`). Logs patch hash and byte size.

```
--- a/src/auth.ts
+++ b/src/auth.ts
@@ -10,2 +10,3 @@
   const token = jwt.sign(payload, secret);
+  logger.info("Token issued", { userId: payload.sub });
   return token;
```

---

## grep

Search for a regex pattern using `ripgrep`.

```ts
{
  pattern: string    // regex
  path?: string      // directory or file (default: rootDir)
}
```

Returns matching lines with file paths and line numbers (`rg -n` format). Exit code 1 (no matches) returns empty string. Exit code 2 throws stderr as error. Requires `ripgrep` in PATH.

```
src/auth.ts:15:  if (token.expired()) {
src/auth.ts:42:  validateToken(token);
tests/auth.test.ts:8:  const token = createTestToken();
```
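
The exit-code handling and the `rg -n` line format can be sketched as follows. `interpretRipgrepExit` and `parseMatch` are hypothetical helpers, not part of Smithers. Throwing for any exit code other than 0 or 1 is an assumption, and the parser assumes file paths contain no colons.

```ts
// Hypothetical helpers illustrating the documented grep behavior.
function interpretRipgrepExit(code: number, stdout: string, stderr: string): string {
  if (code === 0) return stdout; // matches found
  if (code === 1) return ""; // no matches is not an error
  throw new Error(stderr || `ripgrep exited with code ${code}`);
}

interface RgMatch {
  file: string;
  line: number;
  text: string;
}

// Parse one "file:line:text" row; returns null if the row does not fit.
function parseMatch(row: string): RgMatch | null {
  const m = /^(.*?):(\d+):(.*)$/.exec(row);
  if (m === null) return null;
  const [, file = "", lineNo = "", text = ""] = m;
  return { file, line: Number(lineNo), text };
}

const noMatches = interpretRipgrepExit(1, "", "");
const parsed = parseMatch("src/auth.ts:15:  if (token.expired()) {");
```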

---

## bash

Execute a shell command.

```ts
{
  cmd: string                     // executable or command
  args?: string[]                 // arguments
  opts?: { cwd?: string }        // working directory (sandboxed)
}
```

Returns combined stdout and stderr. Working directory defaults to `rootDir`. Timeout: 60s (killed with `SIGKILL` via process group). Non-zero exit codes throw.

### Network Blocking

Controlled by `allowNetwork` in `RunOptions`, `--allow-network` on CLI, or server config. Default: blocked.

When blocked, the command string (executable + args) is checked against these fragments:

| Category | Blocked strings |
|---|---|
| HTTP clients | `curl`, `wget` |
| URL prefixes | `http://`, `https://` |
| Package managers | `npm`, `bun`, `pip` |
| Git remote ops | `git push`, `git pull`, `git fetch`, `git clone`, `git remote` |

Local git commands (`git status`, `git diff`, `git log`) are allowed.
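
The fragment check can be pictured as a substring scan over the joined command string. This is a naive sketch built from the table above, not the actual implementation: `isNetworkCommand` is a hypothetical helper, and the real matching rules may be stricter or looser.

```ts
// Naive sketch of the fragment check; the real matching rules may differ.
const BLOCKED_FRAGMENTS = [
  "curl", "wget", // HTTP clients
  "http://", "https://", // URL prefixes
  "npm", "bun", "pip", // package managers
  "git push", "git pull", "git fetch", "git clone", "git remote", // git remote ops
];

function isNetworkCommand(cmd: string, args: string[] = []): boolean {
  // The executable and its arguments are checked together as one string.
  const commandString = [cmd, ...args].join(" ");
  return BLOCKED_FRAGMENTS.some((fragment) => commandString.includes(fragment));
}

const statusBlocked = isNetworkCommand("git", ["status"]); // local git command
const pushBlocked = isNetworkCommand("git", ["push", "origin", "main"]);
const curlBlocked = isNetworkCommand("curl", ["example.com"]);
```

A plain substring scan like this one also flags unrelated commands that happen to contain a fragment, so treat it as an illustration of the policy rather than a precise filter.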

```ts
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { bash } from "smithers-orchestrator";

const devAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  tools: { bash },
});
```

```tsx
{/* outputs comes from createSmithers() */}
<Task id="lint" output={outputs.lint} agent={devAgent}>
  Run the linter on src/ and report any issues.
</Task>
```

---

## Using Tools with Agents

Pass tools to an [AI SDK](https://ai-sdk.dev) agent, assign the agent to a [`<Task>`](/components/task):

```tsx
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { createSmithers, read, write, edit, grep, bash } from "smithers-orchestrator";
import { z } from "zod";

const codeAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  tools: { read, write, edit, grep, bash },
  instructions: "You are a senior engineer. Use the available tools to complete tasks.",
});

const { Workflow, Task, smithers, outputs } = createSmithers({
  result: z.object({ summary: z.string() }),
});

export default smithers((ctx) => (
  <Workflow name="refactor">
    <Task id="refactor" output={outputs.result} agent={codeAgent}>
      {`Refactor the function in ${ctx.input.file} to improve readability.`}
    </Task>
  </Workflow>
));
```

The full bundle works too:

```ts
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { tools } from "smithers-orchestrator";

const fullAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  tools,
});
```

## Configuration

| Option | Default | Description |
|---|---|---|
| `rootDir` | Workflow directory | Sandbox root |
| `allowNetwork` | `false` | Allow network commands in `bash` |
| `maxOutputBytes` | `200000` (200KB) | Max output size per tool |
| `toolTimeoutMs` | `60000` (60s) | Timeout for `bash` and `grep` |

```ts
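import { runWorkflow } from "smithers-orchestrator";

// `workflow` is the default export of your workflow file.
const result = await runWorkflow(workflow, {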
  input: { file: "src/auth.ts" },
  rootDir: "/home/project",
  allowNetwork: false,
  maxOutputBytes: 500_000,
  toolTimeoutMs: 120_000,
});
```
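
To make "sandbox root" concrete, here is a minimal sketch of how a tool path could be confined to `rootDir`. It is an illustration under assumptions, not the Smithers source: `resolveInSandbox` is a hypothetical helper, and it does not follow symlinks.

```typescript
import * as path from "node:path";

// Hypothetical helper: resolve a requested path and reject anything outside rootDir.
function resolveInSandbox(rootDir: string, requested: string): string {
  const root = path.resolve(rootDir);
  const target = path.resolve(root, requested);
  const relative = path.relative(root, target);
  // A relative path that climbs out of root, or an absolute one (different
  // drive on Windows), means the target is not inside the sandbox.
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`Path escapes sandbox root: ${requested}`);
  }
  return target;
}
```

With `rootDir` set to `/home/project`, a request for `src/auth.ts` resolves inside the root, while `../secrets.txt` or `/etc/passwd` is rejected.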

## See Also

- [Agents and Tools](/concepts/agents-and-tools)
- [Sandbox](/components/sandbox)
- [Common External Tools](/integrations/common-tools)
- [Tools Agent Example](/examples/tools-agent)

---

## Integrations

> Connect Smithers workflows to Linear, Notion, Slack, Telegram, email, SMS, and other external systems through tools, CLI skills, MCP, or deterministic tasks.
> Source: https://smithers.sh/integrations/integrations

Smithers does **not** ship first-party clients for Linear, Notion, Slack, Telegram, email, or SMS.

Treat those systems as external integrations your application, skill, or CLI already owns. Smithers provides the orchestration layer around them:

- Durable workflows (`<Workflow>`, `<Task>`, `<Sequence>`, `<Parallel>`, `<Loop>`)
- SDK agents with custom tool objects
- CLI agents with skills, plugins, or MCP config
- Compute tasks for deterministic CLI or API calls

## Popular Integrations

| Service | Common actions | Best Smithers wiring |
|---|---|---|
| Linear | `getIssue`, `listIssues`, `comment`, `updateState` | SDK tool, CLI skill, or task calling your `linear` CLI |
| Notion | `search`, `getPage`, `createPage`, `appendBlock` | SDK tool, CLI skill, or task calling your `notion` CLI |
| Slack | `postMessage`, `replyInThread`, `listChannelHistory` | SDK tool, CLI MCP/server, or deterministic publish task |
| Telegram | `sendMessage`, `sendPhoto`, `pollUpdates` | SDK tool or deterministic bot task |
| Email | `listInbox`, `getThread`, `sendEmail` | SDK tool or deterministic task against your mail provider |
| SMS | `sendSms`, `listMessages`, `lookupNumber` | SDK tool or deterministic task against Twilio or another provider |

The same rule applies to all of them: keep the integration surface narrow and hand Smithers only the operations the workflow actually needs.

## Pattern 1: Pass tools to an SDK agent

Use this when the agent needs judgment, but the external system calls should stay explicit and reviewable.

```ts
import { ToolLoopAgent as Agent, tool, zodSchema } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
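import { z } from "zod";

// linearClient, notionClient, slackClient, telegramClient, emailClient, and smsClient
// are your own thin wrappers (for example ./integrations/linear.ts). Smithers does not ship them.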

const linearGetIssue = tool({
  description: "Fetch a Linear issue",
  inputSchema: zodSchema(z.object({ id: z.string() })),
  execute: async ({ id }) => linearClient.getIssue(id),
});

const notionSearch = tool({
  description: "Search Notion pages",
  inputSchema: zodSchema(z.object({ query: z.string() })),
  execute: async ({ query }) => notionClient.search(query),
});

const slackPostMessage = tool({
  description: "Post a Slack message",
  inputSchema: zodSchema(z.object({ channel: z.string(), text: z.string() })),
  execute: async ({ channel, text }) => slackClient.postMessage(channel, text),
});

const telegramSendMessage = tool({
  description: "Send a Telegram message",
  inputSchema: zodSchema(z.object({ chatId: z.string(), text: z.string() })),
  execute: async ({ chatId, text }) => telegramClient.sendMessage(chatId, text),
});

const sendEmail = tool({
  description: "Send an email",
  inputSchema: zodSchema(z.object({
    to: z.string().email(),
    subject: z.string(),
    body: z.string(),
  })),
  execute: async (input) => emailClient.send(input),
});

const sendSms = tool({
  description: "Send an SMS",
  inputSchema: zodSchema(z.object({ to: z.string(), body: z.string() })),
  execute: async (input) => smsClient.send(input),
});

const opsAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  tools: {
    linearGetIssue,
    notionSearch,
    slackPostMessage,
    telegramSendMessage,
    sendEmail,
    sendSms,
  },
});
```

```tsx
{/* outputs comes from createSmithers() */}
<Task id="triage" output={outputs.triage} agent={opsAgent}>
  {`Read Linear issue ${ctx.input.issueId}, find the matching Notion spec, and decide whether to notify Slack, Telegram, email, or SMS.`}
</Task>
```

Put auth, retries, and provider-specific code in small helper modules such as `./integrations/linear.ts` or `./integrations/slack.ts`. Do not give the agent a full provider SDK if it only needs two or three actions.
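
A helper module along those lines might look like the sketch below. Everything in it is hypothetical (the `createIssueClient` and `withRetry` names, the `/issues/:id` proxy route, the retry counts); it only shows the shape: auth and retries live in the helper, and the agent's tool sees a single operation.

```typescript
// Hypothetical ./integrations/linear.ts; names and endpoint shape are illustrative.
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Retry a failing async call with a linearly growing delay between attempts.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3, delayMs = 100): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
      }
    }
  }
  throw lastError;
}

function createIssueClient(baseUrl: string, token: string, fetchImpl: FetchLike) {
  return {
    // The only operation the agent's tool needs; nothing else is exposed.
    async getIssue(id: string): Promise<unknown> {
      return withRetry(async () => {
        const res = await fetchImpl(`${baseUrl}/issues/${encodeURIComponent(id)}`, {
          headers: { authorization: `Bearer ${token}` },
        });
        if (!res.ok) throw new Error(`getIssue failed with status ${res.status}`);
        return res.json();
      });
    },
  };
}
```

Injecting `fetchImpl` keeps the helper testable without network access; a tool's `execute` function would then call `client.getIssue(id)` and nothing more.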

## Pattern 2: Pass a skill, plugin, or MCP config to a CLI agent

Use this when your CLI agent already supports external integrations and Smithers should only orchestrate the task.

```ts
import { ClaudeCodeAgent, PiAgent, KimiAgent } from "smithers-orchestrator";

const claude = new ClaudeCodeAgent({
  model: "claude-sonnet-4-20250514",
  mcpConfig: ["./mcp.json"],
  pluginDir: ["./.claude/plugins"],
});

const pi = new PiAgent({
  provider: "openai",
  model: "gpt-5.2-codex",
  skill: ["./skills/linear", "./skills/notion"],
});

const kimi = new KimiAgent({
  model: "kimi-latest",
  mcpConfigFile: ["./mcp.json"],
  skillsDir: "./skills",
});
```

```tsx
<Task id="ticket-review" output={outputs.review} agent={pi}>
  {`Use the Linear skill to inspect ${ctx.input.issueId}, use the Notion skill to load the spec, then summarize next actions.`}
</Task>
```

Smithers manages retries, durable outputs, approvals, and sequencing. The CLI agent manages the Linear or Notion connection through its skill, plugin, or MCP layer.

## Pattern 3: Run the `linear` or `notion` CLI in a task

Use this when the step is deterministic and you do not need the model involved.

```tsx
<Task id="load-linear" output={outputs.linearIssue}>
  {async () => {
    const proc = Bun.spawn(["linear", /* your args here */], {
      stdout: "pipe",
      stderr: "pipe",
    });

    // Read both streams concurrently, then check the exit code.
    const [stdout, stderr] = await Promise.all([
      new Response(proc.stdout).text(),
      new Response(proc.stderr).text(),
    ]);

    if ((await proc.exited) !== 0) {
      throw new Error(stderr || stdout);
    }

    return JSON.parse(stdout);
  }}
</Task>

<Task id="publish-notion" output={outputs.publishResult}>
  {async () => {
    const proc = Bun.spawn(["notion", /* your args here */], {
      stdout: "pipe",
      stderr: "pipe",
    });

    // Read both streams concurrently, then check the exit code.
    const [stdout, stderr] = await Promise.all([
      new Response(proc.stdout).text(),
      new Response(proc.stderr).text(),
    ]);

    if ((await proc.exited) !== 0) {
      throw new Error(stderr || stdout);
    }

    return JSON.parse(stdout);
  }}
</Task>
```

Replace the argument arrays with the exact `linear` or `notion` commands your team already uses. The point is the pattern: compute task in, structured JSON out.
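
Since both tasks repeat the same "check exit code, then parse JSON" step, that part can be factored into a small pure function. The sketch below is one way to do it; `CliResult` and `parseCliJson` are hypothetical names, not Smithers exports.

```typescript
// Hypothetical shape for the captured result of a finished CLI process.
type CliResult = { exitCode: number; stdout: string; stderr: string };

// Turn a finished CLI invocation into structured JSON, or throw a useful error.
function parseCliJson<T = unknown>(result: CliResult): T {
  if (result.exitCode !== 0) {
    const message = result.stderr.trim() || result.stdout.trim() || `exit code ${result.exitCode}`;
    throw new Error(message);
  }
  try {
    return JSON.parse(result.stdout) as T;
  } catch {
    // Show only the start of the output so the error stays readable.
    throw new Error(`CLI did not print valid JSON: ${result.stdout.slice(0, 200)}`);
  }
}
```

A compute task would collect `exitCode`, `stdout`, and `stderr` from the spawned process and return `parseCliJson(result)`.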

## React Hook Libraries

If you are building a React frontend on top of Smithers-backed routes or your own AI endpoints, these libraries fit well:

| Library | Use it for | Notes |
|---|---|---|
| `@ai-sdk/react` | Chat, completion, streamed objects, assistant-style UIs | Best default if your app already uses the Vercel AI SDK transport and UI stream format |
| `@tanstack/ai-react` | Typed chat clients, SSE adapters, tool approval flows, client tools, generation hooks | Good fit if you want TanStack-style client state and typed tool execution |
| `@tanstack/react-query` | Thread lists, run history, side panels, metadata, optimistic mutations | Complementary cache/query layer, not a replacement for chat streaming hooks |
| `@tambo-ai/react` | Generative React UI with provider-based hooks and thread state | Worth considering if your frontend is more component-generation than plain chat |

### Vercel AI SDK React Hooks

Use `@ai-sdk/react` when the client speaks the AI SDK UI protocol and you want batteries-included hooks such as `useChat`, `useCompletion`, and `useObject` (exported as `experimental_useObject`).

```tsx
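import { useChat } from "@ai-sdk/react";
import { DefaultChatTransport } from "ai";

export function ChatPanel() {
  // Recent AI SDK versions configure the endpoint through a transport.
  const { messages, sendMessage, status } = useChat({
    transport: new DefaultChatTransport({ api: "/api/chat" }),
  });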

  return (
    <div>
      {messages.map((message) => (
        <div key={message.id}>{message.role}</div>
      ))}
      <button
        disabled={status !== "ready"}
        onClick={() => sendMessage({ text: "Summarize the latest Linear issue." })}
      >
        Send
      </button>
    </div>
  );
}
```

### TanStack AI

Use `@tanstack/ai-react` when you want a typed chat client with connection adapters, tool approval support, and TanStack-style generation hooks beyond chat.

```tsx
import { useChat, fetchServerSentEvents } from "@tanstack/ai-react";

export function ChatPanel() {
  const { messages, sendMessage, isLoading } = useChat({
    connection: fetchServerSentEvents("/api/chat"),
  });

  return (
    <div>
      {messages.map((message) => (
        <div key={message.id}>{message.role}</div>
      ))}
      <button disabled={isLoading} onClick={() => sendMessage("Summarize the latest Linear issue.")}>
        Send
      </button>
    </div>
  );
}
```

### TanStack Query and Beads

- `@tanstack/react-query` is useful alongside the chat hooks above for fetching Smithers runs, approvals, ticket metadata, user settings, and other non-streaming resources.
- Beads is **not** a React hook library. It is a persistent task and memory system for coding agents, so it belongs in agent/runtime tooling, not your React chat-hook layer.

## Choosing the Right Pattern

| If you need | Prefer |
|---|---|
| AI judgment over a small integration surface | SDK agent with narrow tools |
| Existing CLI ecosystem support | CLI agent with skills, plugins, or MCP |
| Deterministic sync or publish steps | Compute task calling the external CLI or API |

## Non-Existent APIs

Smithers does **not** ship:

- `smithers-orchestrator/linear`
- `smithers-orchestrator/notion`
- Built-in Slack, Telegram, email, or SMS clients
- Built-in webhook helpers for those services

## See Also

- [Agents and Tools](/concepts/agents-and-tools)
- [Built-in Tools](/integrations/tools)
- [CLI Agents](/integrations/cli-agents)
- [SDK Agents](/integrations/sdk-agents)

---

## Common External Tools

> Practical integration patterns for GitHub, Linear, Notion, Slack, and Obsidian using gateway, OpenAPI tools, and Smithers' built-in file tools.
> Source: https://smithers.sh/integrations/common-tools

Smithers is an orchestration framework, not a directory of first-party SaaS clients.

That is usually the right trade: you keep external integrations thin and explicit, and Smithers handles the sequencing, retries, persistence, approvals, and control flow around them.

For most teams, the winning pattern is one of these:

- gateway plus webhooks for event-driven systems
- `createOpenApiTools()` for APIs with an OpenAPI spec
- `defineTool()` for a small number of custom actions
- built-in file tools for local filesystems such as an Obsidian vault

## Quick Map

| System | Best Smithers wiring | Best for |
| --- | --- | --- |
| GitHub | Gateway plus webhook receiver | PRs, issues, comments, checks, bots |
| Linear | OpenAPI tools against an internal or generated spec | Triage, status updates, comments |
| Notion | OpenAPI tools against a Notion-facing spec or proxy | Spec lookup, page creation, block updates |
| Slack | Incoming webhooks plus gateway or OpenAPI/custom tools | Notifications, slash commands, interactive workflows |
| Obsidian | Built-in file tools or compute tasks | Reading and writing notes in a local vault |

## GitHub

GitHub is event-driven, so the gateway is the natural entry point.

Typical flow:

1. GitHub sends a webhook
2. Your receiver verifies the signature
3. The receiver calls `runs.create` or `signals.send`
4. The workflow calls GitHub back through tools or a client

```ts
await fetch("http://127.0.0.1:7331/rpc", {
  method: "POST",
  headers: {
    "content-type": "application/json",
    authorization: `Bearer ${process.env.GATEWAY_TOKEN}`,
  },
  body: JSON.stringify({
    method: "runs.create",
    params: {
      workflow: "github-pr-review",
      input: {
        owner: payload.repository.owner.login,
        repo: payload.repository.name,
        pullNumber: payload.pull_request.number,
      },
    },
  }),
});
```
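
Step 2 of the flow, signature verification, happens before the `runs.create` call. GitHub signs the raw request body with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex>`. A minimal sketch using Node's `crypto` module follows; `verifyGitHubSignature` is a hypothetical helper name, and how you obtain the raw body depends on your HTTP framework.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Returns true only if the header matches the HMAC of the raw body.
function verifyGitHubSignature(
  rawBody: string,
  signatureHeader: string | undefined,
  secret: string,
): boolean {
  if (!signatureHeader || !signatureHeader.startsWith("sha256=")) return false;
  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
  const received = Buffer.from(signatureHeader, "utf8");
  const computed = Buffer.from(expected, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === computed.length && timingSafeEqual(received, computed);
}
```

Verify against the exact bytes GitHub sent, not a re-serialized JSON object, and reject the request before touching the gateway if the check fails.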

Use GitHub when you need:

- PR review bots
- issue triage
- check-run orchestration
- comment-driven workflows with `signals.send`

For a full example, see [GitHub Bot](/integrations/github-bot).

## Linear

Linear's public API is GraphQL, so the smoothest OpenAPI story is usually a small internal REST proxy or generated OpenAPI surface for the subset of Linear actions you need.

That keeps the tool surface narrow and makes the agent's choices more predictable.

```ts
import { createOpenApiTools } from "smithers-orchestrator";

const linearTools = await createOpenApiTools("./specs/linear-proxy.openapi.json", {
  auth: {
    type: "bearer",
    token: process.env.LINEAR_API_KEY!,
  },
  include: [
    "getIssue",
    "listIssues",
    "commentOnIssue",
    "updateIssueState",
  ],
});
```

```tsx
<Task id="triage-linear" output={outputs.triage} agent={opsAgent}>
  Read the issue, decide the next state, and leave a short status comment.
</Task>
```

If you do not have an OpenAPI spec for your Linear surface, write a few `defineTool()` wrappers instead.

## Notion

Notion works well with the same pattern: define or generate an OpenAPI description for the pages and search endpoints your team actually uses, then hand that smaller surface to an agent.

```ts
const notionTools = await createOpenApiTools("./specs/notion.openapi.json", {
  auth: {
    type: "bearer",
    token: process.env.NOTION_TOKEN!,
  },
  include: [
    "searchPages",
    "getPage",
    "createPage",
    "appendBlockChildren",
  ],
});
```

```tsx
<Task id="publish-spec" output={outputs.publish} agent={writerAgent}>
  Find the matching spec page in Notion and append a release summary.
</Task>
```

Good Notion use cases:

- look up supporting context before an agent decides something
- create or update project pages
- append release notes or meeting summaries

## Slack

Slack usually splits into two parts:

- inbound events or slash commands trigger workflows through the gateway
- outbound messages use either incoming webhooks or a Slack API client/tool

### Slash Command or Event Receiver

```ts
await fetch("http://127.0.0.1:7331/rpc", {
  method: "POST",
  headers: {
    "content-type": "application/json",
    authorization: `Bearer ${process.env.GATEWAY_TOKEN}`,
  },
  body: JSON.stringify({
    method: "runs.create",
    params: {
      workflow: "slack-triage",
      input: {
        channel: command.channel_id,
        user: command.user_id,
        text: command.text,
      },
    },
  }),
});
```
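
The `command` object in the receiver above has to come from somewhere. Slack delivers slash commands as a form-encoded body, so a small parser is usually enough. This is a sketch; `SlashCommand` and `parseSlashCommand` are hypothetical names, and only the three fields the receiver uses are extracted.

```typescript
// The subset of Slack slash-command fields the receiver forwards to the workflow.
type SlashCommand = { channel_id: string; user_id: string; text: string };

// Parse an application/x-www-form-urlencoded slash-command body.
function parseSlashCommand(rawBody: string): SlashCommand {
  const form = new URLSearchParams(rawBody);
  const channel_id = form.get("channel_id");
  const user_id = form.get("user_id");
  if (!channel_id || !user_id) {
    throw new Error("Slash command payload is missing channel_id or user_id");
  }
  // `text` is empty when the user types the command with no arguments.
  return { channel_id, user_id, text: form.get("text") ?? "" };
}
```

Validating the payload here keeps malformed requests from ever creating a run.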

### Outbound Notification

For simple posting, an incoming webhook is enough:

```tsx
<Task id="notify-slack" output={outputs.notify}>
  {async () => {
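    const res = await fetch(process.env.SLACK_WEBHOOK_URL!, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({
        text: "Workflow completed successfully.",
      }),
    });

    // Fail the task on a non-2xx response so a failed post is not recorded as sent.
    if (!res.ok) {
      throw new Error(`Slack webhook failed: ${res.status} ${await res.text()}`);
    }

    return { sent: true };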
  }}
</Task>
```

For richer Slack API actions, use `createOpenApiTools()` against a proxy/spec or define a few custom tools.

## Obsidian

Obsidian is usually just a directory on disk, which means Smithers' built-in file tools are already enough.

```ts
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { read, write, edit, grep } from "smithers-orchestrator";

const obsidianAgent = new Agent({
  model: anthropic("claude-sonnet-4-20250514"),
  tools: { read, write, edit, grep },
});
```

```tsx
<Task id="update-vault-note" output={outputs.update} agent={obsidianAgent}>
  Read the note at vault/projects/release-plan.md, summarize the open tasks,
  and update the "Status" section in place.
</Task>
```

If you want stricter control than an agent with file tools, use a compute task:

```tsx
<Task id="append-daily-note" output={outputs.append}>
  {async () => {
    const path = `${process.env.OBSIDIAN_VAULT}/Daily/2026-04-09.md`;
    const previous = await Bun.file(path).text();
    await Bun.write(path, `${previous}\n- Smithers finished the release checklist.`);
    return { updated: true };
  }}
</Task>
```

## Choosing Between OpenAPI and Custom Tools

Use OpenAPI tools when:

- you already have a spec
- the surface area is medium or large
- you want the model to choose among many operations

Use `defineTool()` or a compute task when:

- the service has no usable OpenAPI spec
- you only need a few actions
- you want exact control over the network call

## Practical Advice

- Keep the external tool surface narrow
- Prefer gateway for inbound event streams
- Prefer OpenAPI tools for broad REST-style APIs
- Prefer custom tools for a few high-value operations
- Persist business state in outputs, not in third-party client caches

## Next Steps

- [Gateway](/integrations/gateway)
- [GitHub Bot](/integrations/github-bot)
- [Built-in Tools](/integrations/tools)
- [OpenAPI Tools](/concepts/openapi-tools)

---

## CLI Agents

> Run external AI CLI tools (Claude Code, Codex, Gemini CLI, PI, Kimi, Forge, Amp) as drop-in Smithers agents that implement the AI SDK agent interface.
> Source: https://smithers.sh/integrations/cli-agents

CLI-backed agent classes wrap external AI command-line tools. Each implements the [AI SDK](https://ai-sdk.dev) `Agent` interface and works anywhere Smithers accepts an agent, including [`<Task>`](/components/task).

The agent spawns the CLI, passes the prompt, captures output, and returns a `GenerateTextResult`.

For API-billed provider wrappers, see [SDK Agents](/integrations/sdk-agents).

## Import

```ts
import {
  ClaudeCodeAgent,
  CodexAgent,
  GeminiAgent,
  PiAgent,
  KimiAgent,
  ForgeAgent,
  AmpAgent,
  type PiAgentOptions,
  type PiExtensionUiRequest,
  type PiExtensionUiResponse,
} from "smithers-orchestrator";
```

## Prerequisites

| Agent | CLI Required | Install |
|---|---|---|
| `ClaudeCodeAgent` | `claude` | [Claude Code](https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/overview) |
| `CodexAgent` | `codex` | [OpenAI Codex CLI](https://github.com/openai/codex) |
| `GeminiAgent` | `gemini` | [Gemini CLI](https://ai.google.dev) |
| `PiAgent` | `pi` | [PI Coding Agent](https://github.com/badlogic/pi-mono/tree/main/packages/coding-agent) |
| `KimiAgent` | `kimi` | [Kimi CLI](https://moonshotai.github.io/kimi-cli/) |
| `ForgeAgent` | `forge` | [Forge CLI](https://github.com/antinomyhq/forge) |
| `AmpAgent` | `amp` | [Amp CLI](https://github.com/nichochar/amp-cli) |

## Quick Start

```ts
import { ClaudeCodeAgent, CodexAgent, GeminiAgent, PiAgent, KimiAgent, ForgeAgent, AmpAgent } from "smithers-orchestrator";

const claude = new ClaudeCodeAgent({ model: "claude-sonnet-4-20250514" });
const codex = new CodexAgent({ model: "gpt-4.1" });
const gemini = new GeminiAgent({ model: "gemini-2.5-pro" });
const pi = new PiAgent({ provider: "openai", model: "gpt-5.2-codex" });
const kimi = new KimiAgent({ model: "kimi-latest" });
const forge = new ForgeAgent({ model: "anthropic/claude-sonnet-4-20250514" });
const amp = new AmpAgent({ model: "claude-sonnet-4-20250514" });
```

```tsx
{/* outputs comes from createSmithers() */}
<Task id="analysis" output={outputs.analysis} agent={claude}>
  {`Analyze the codebase and identify potential improvements.`}
</Task>
```

---

## Hijack Support

All built-in CLI agents support native-session hijack via `smithers hijack <runId>`.

| Agent | Hijack Mode | Native Relaunch |
|---|---|---|
| `ClaudeCodeAgent` | Native CLI session | `claude --resume <session>` |
| `CodexAgent` | Native CLI session | `codex resume <session> -C <cwd>` |
| `GeminiAgent` | Native CLI session | `gemini --resume <session>` |
| `PiAgent` | Native CLI session | `pi --session <session>` |
| `KimiAgent` | Native CLI session | `kimi --session <session> --work-dir <cwd>` |
| `ForgeAgent` | Native CLI session | `forge --conversation-id <id> -C <cwd>` |
| `AmpAgent` | Native CLI session | `amp threads continue <thread>` |

Behavior:

- Live run: Smithers waits until the agent is between blocking tool calls before aborting.
- Finished/cancelled run: Smithers reopens the latest persisted native session.
- If the hijacked session exits successfully, the workflow resumes automatically in detached mode.
- Cross-engine hijack is not supported.

Use `smithers hijack <runId> --launch=false` to inspect the resumable candidate without opening the session.

### Non-Idempotent Tool Resume Warning

When a [`<Task>`](/components/task) retries after a failure, previous attempts may have already executed side-effect tools (e.g., sending messages, creating PRs). Smithers detects non-idempotent tool calls from prior attempts and prepends a warning to the agent's prompt:

> Previous attempts in this task already called non-idempotent side-effect tools. Those side effects may already have happened before the interruption or retry. Do not blindly call them again. Verify external state first or continue from the prior result.

The warning includes the specific tool names and attempt numbers. It is automatically injected — no configuration is required.
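
Conceptually, the warning is the quoted text plus a list of the offending calls. The sketch below shows that idea only. `PriorToolCall`, its `idempotent` flag, and `buildResumeWarning` are hypothetical, and how Smithers actually classifies tools or formats the list is not specified here.

```typescript
// Hypothetical record of a tool call made by an earlier attempt.
type PriorToolCall = { tool: string; attempt: number; idempotent: boolean };

const RESUME_WARNING =
  "Previous attempts in this task already called non-idempotent side-effect tools. " +
  "Those side effects may already have happened before the interruption or retry. " +
  "Do not blindly call them again. Verify external state first or continue from the prior result.";

// Returns the text to prepend to the prompt, or null when nothing risky ran.
function buildResumeWarning(calls: PriorToolCall[]): string | null {
  const risky = calls.filter((call) => !call.idempotent);
  if (risky.length === 0) return null;
  const details = risky
    .map((call) => `- ${call.tool} (attempt ${call.attempt})`)
    .join("\n");
  return `${RESUME_WARNING}\n${details}`;
}
```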

---

## Base Options

```ts
type BaseCliAgentOptions = {
  id?: string;               // Agent ID (default: random UUID)
  model?: string;            // Model name to pass to the CLI
  systemPrompt?: string;     // System prompt prepended to the user prompt
  instructions?: string;     // Alias for systemPrompt
  cwd?: string;              // Working directory for the CLI process
  env?: Record<string, string>;  // Additional environment variables
  yolo?: boolean;            // Skip permission prompts (default: true)
  timeoutMs?: number;        // Hard wall-clock timeout in milliseconds
  idleTimeoutMs?: number;    // Inactivity timeout (no stdout/stderr) in milliseconds
  maxOutputBytes?: number;   // Max output capture size
  extraArgs?: string[];      // Additional CLI arguments appended to the command
};
```

| Option | Default | Description |
|---|---|---|
| `id` | Random UUID | Agent instance identifier |
| `model` | `undefined` | Model name passed to `--model` |
| `systemPrompt` | `undefined` | System instructions prepended to the prompt |
| `instructions` | `undefined` | Alias for `systemPrompt` |
| `cwd` | Tool context rootDir or `process.cwd()` | Working directory for the spawned process |
| `env` | `{}` | Extra environment variables merged with `process.env` |
| `yolo` | `true` | Skip all interactive permission prompts |
| `timeoutMs` | `undefined` | Hard wall-clock timeout; kills process after this many ms |
| `idleTimeoutMs` | `undefined` | Inactivity timeout; kills process after this many ms with no output |
| `maxOutputBytes` | `undefined` | Truncate captured output to this size |
| `extraArgs` | `[]` | Additional CLI flags |

### Timeouts

- `timeoutMs`: hard wall-clock cap.
- `idleTimeoutMs`: inactivity cap, resets on any stdout/stderr output.

Per-call override:

```ts
await agent.generate({
  prompt: "do the thing",
  timeout: { totalMs: 15 * 60 * 1000, idleMs: 2 * 60 * 1000 },
});
```
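
The difference between the two caps is easiest to see as a pure function over timestamps. This is a sketch of the semantics described above, not the Smithers implementation; `TimeoutState`, `expiredTimeout`, and `recordOutput` are hypothetical names.

```typescript
type TimeoutState = { startedAt: number; lastOutputAt: number };
type TimeoutLimits = { timeoutMs?: number; idleTimeoutMs?: number };

// Decide which cap, if any, has been exceeded at time `now` (all values in ms).
function expiredTimeout(
  state: TimeoutState,
  limits: TimeoutLimits,
  now: number,
): "total" | "idle" | null {
  if (limits.timeoutMs !== undefined && now - state.startedAt >= limits.timeoutMs) {
    return "total";
  }
  if (limits.idleTimeoutMs !== undefined && now - state.lastOutputAt >= limits.idleTimeoutMs) {
    return "idle";
  }
  return null;
}

// Called whenever the process writes to stdout or stderr: resets only the idle clock.
function recordOutput(state: TimeoutState, now: number): TimeoutState {
  return { ...state, lastOutputAt: now };
}
```

A chatty process can therefore run until `timeoutMs`, while a silent one is stopped after `idleTimeoutMs` even if the wall-clock cap is far away.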

---

## ClaudeCodeAgent

Wraps the `claude` CLI in `--print` mode.

```ts
const claude = new ClaudeCodeAgent({
  model: "claude-sonnet-4-20250514",
  systemPrompt: "You are a careful code reviewer.",
  timeoutMs: 30 * 60 * 1000,
  idleTimeoutMs: 2 * 60 * 1000,
});
```

### Claude-Specific Options

```ts
type ClaudeCodeAgentOptions = BaseCliAgentOptions & {
  addDir?: string[];
  agent?: string;
  agents?: Record<string, { description?: string; prompt?: string }> | string;
  allowDangerouslySkipPermissions?: boolean;
  allowedTools?: string[];
  appendSystemPrompt?: string;
  betas?: string[];
  chrome?: boolean;
  continue?: boolean;
  dangerouslySkipPermissions?: boolean;
  debug?: boolean | string;
  debugFile?: string;
  disableSlashCommands?: boolean;
  disallowedTools?: string[];
  fallbackModel?: string;
  file?: string[];
  forkSession?: boolean;
  fromPr?: string;
  ide?: boolean;
  includePartialMessages?: boolean;
  inputFormat?: "text" | "stream-json";
  jsonSchema?: string;
  maxBudgetUsd?: number;
  mcpConfig?: string[];
  mcpDebug?: boolean;
  noChrome?: boolean;
  noSessionPersistence?: boolean;
  outputFormat?: "text" | "json" | "stream-json";
  permissionMode?: "acceptEdits" | "bypassPermissions" | "default" | "delegate" | "dontAsk" | "plan";
  pluginDir?: string[];
  replayUserMessages?: boolean;
  resume?: string;
  sessionId?: string;
  settingSources?: string;
  settings?: string;
  strictMcpConfig?: boolean;
  tools?: string[] | "default" | "";
  verbose?: boolean;
};
```

| Option | Description |
|---|---|
| `permissionMode` | `"bypassPermissions"`, `"acceptEdits"`, `"default"`, `"delegate"`, `"dontAsk"`, `"plan"` |
| `allowedTools` | Tool name whitelist |
| `disallowedTools` | Tool name blacklist |
| `disableSlashCommands` | Disable all slash commands |
| `maxBudgetUsd` | Spending cap in USD |
| `mcpConfig` | [Model Context Protocol](https://modelcontextprotocol.io) server configuration files |
| `mcpDebug` | Enable MCP debug logging |
| `addDir` | Additional context directories |
| `file` | Files to inject into context |
| `fromPr` | Pull request URL or number to use as additional context |
| `fallbackModel` | Model to use if the primary model is unavailable |
| `appendSystemPrompt` | Text appended to the system prompt |
| `agents` | Multi-agent configuration as a map of agent definitions or JSON string |
| `betas` | Beta feature flags to enable |
| `pluginDir` | Plugin directories for Claude Code skills |
| `resume` / `sessionId` | Resume a previous session by ID |
| `settings` / `settingSources` | Override settings file or sources |
| `jsonSchema` | JSON schema string for structured output |
| `includePartialMessages` | Stream partial assistant messages |
| `inputFormat` | `"text"` or `"stream-json"` for input |
| `outputFormat` | `"text"`, `"json"`, or `"stream-json"` (default: `"stream-json"`) |

When `yolo` is `true` (default), the agent passes `--allow-dangerously-skip-permissions`, `--dangerously-skip-permissions`, and `--permission-mode bypassPermissions` unless `permissionMode` is explicitly set.

### PR Context

The `fromPr` option passes `--from-pr <value>` to the Claude CLI, loading the diff and metadata of the specified pull request into the conversation context. Accepts a PR URL or number:

```ts
const claude = new ClaudeCodeAgent({
  model: "claude-sonnet-4-20250514",
  fromPr: "https://github.com/org/repo/pull/42",
});
```

Smithers does not fetch the PR itself; the Claude CLI resolves and loads it.

---

## CodexAgent

Wraps the `codex` CLI using `codex exec` with stdin input.

```ts
const codex = new CodexAgent({
  model: "gpt-4.1",
  sandbox: "workspace-write",
  fullAuto: true,
});
```

### Codex-Specific Options

```ts
type CodexAgentOptions = BaseCliAgentOptions & {
  config?: Record<string, string | number | boolean | object | null> | string[];
  enable?: string[];
  disable?: string[];
  image?: string[];
  oss?: boolean;
  localProvider?: string;
  sandbox?: "read-only" | "workspace-write" | "danger-full-access";
  profile?: string;
  fullAuto?: boolean;
  dangerouslyBypassApprovalsAndSandbox?: boolean;
  cd?: string;
  skipGitRepoCheck?: boolean;
  addDir?: string[];
  outputSchema?: string;
  color?: "always" | "never" | "auto";
  json?: boolean;
  outputLastMessage?: string;
};
```

| Option | Description |
|---|---|
| `sandbox` | `"read-only"`, `"workspace-write"`, or `"danger-full-access"` |
| `fullAuto` | Full auto mode (no confirmations) |
| `dangerouslyBypassApprovalsAndSandbox` | Skip all [approval](/concepts/approvals) prompts and [sandbox](/components/sandbox) restrictions |
| `config` | Configuration overrides as key-value pairs or raw strings |
| `oss` | Use open-source models |
| `localProvider` | Local model provider URL |
| `image` | Image file paths to include as visual inputs |
| `outputSchema` | Path to JSON schema file for structured output |
| `outputLastMessage` | File path to write the last message (auto-generated if not set) |

When `yolo` is `true` and `fullAuto` is not set, the agent passes `--dangerously-bypass-approvals-and-sandbox`. If `fullAuto` is `true`, it passes `--full-auto` instead.

Prompt is passed via stdin using the `-` argument.
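
The flag selection described above can be sketched as a small function. This is an illustration under assumptions, not the agent's source. `codexApprovalArgs` is a hypothetical name, and treating an explicit `fullAuto: false` as "set" (so no bypass flag is added) is one possible reading of the rule.

```typescript
type CodexApprovalOptions = { yolo?: boolean; fullAuto?: boolean };

// Pick the approval-related flag for `codex exec`; yolo defaults to true.
function codexApprovalArgs({ yolo = true, fullAuto }: CodexApprovalOptions): string[] {
  if (fullAuto === true) return ["--full-auto"];
  if (yolo && fullAuto === undefined) return ["--dangerously-bypass-approvals-and-sandbox"];
  return [];
}
```

With default options this yields the bypass flag, which is why setting `yolo: false` or `fullAuto: true` is worth considering for workflows that touch sensitive repositories.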

---

## GeminiAgent

Wraps the `gemini` CLI.

```ts
const gemini = new GeminiAgent({
  model: "gemini-2.5-pro",
  sandbox: true,
  allowedTools: ["read_file", "write_file"],
});
```

### Gemini-Specific Options

```ts
type GeminiAgentOptions = BaseCliAgentOptions & {
  debug?: boolean;
  sandbox?: boolean;
  approvalMode?: "default" | "auto_edit" | "yolo" | "plan";
  experimentalAcp?: boolean;
  allowedMcpServerNames?: string[];
  allowedTools?: string[];
  extensions?: string[];
  listExtensions?: boolean;
  resume?: string;
  listSessions?: boolean;
  deleteSession?: string;
  includeDirectories?: string[];
  screenReader?: boolean;
  outputFormat?: "text" | "json" | "stream-json";
};
```

| Option | Description |
|---|---|
| `sandbox` | Run in [sandbox](/components/sandbox) mode |
| `approvalMode` | `"default"`, `"auto_edit"`, `"yolo"`, or `"plan"` |
| `allowedTools` | Tool name whitelist |
| `allowedMcpServerNames` | MCP server name whitelist |
| `extensions` | Gemini CLI extensions to load |
| `resume` | Resume a previous session by ID |
| `listSessions` / `deleteSession` | Session management |
| `includeDirectories` | Additional directories to include |
| `outputFormat` | `"text"`, `"json"`, or `"stream-json"` (default: `"json"`) |

When `yolo` is `true` and `approvalMode` is not set, the agent passes `--yolo`.

Prompt is passed via `--prompt`.

### gcloud Authentication

When neither `GOOGLE_API_KEY` nor `GEMINI_API_KEY` is set, the Gemini CLI uses `gcloud` application-default credentials. The diagnostics `api_key_valid` check falls back to running `gcloud auth print-access-token` to confirm that gcloud auth is configured. No extra options are required; the Gemini CLI picks up the credentials automatically from the environment:

```bash
gcloud auth application-default login
```

```ts
// No API key needed when gcloud auth is configured
const gemini = new GeminiAgent({ model: "gemini-2.5-pro" });
```

---

## PiAgent

Wraps the `pi` CLI.

```ts
const pi = new PiAgent({
  provider: "openai",
  model: "gpt-5.2-codex",
  mode: "text",
  noSession: true,
});
```

### PI-Specific Options

```ts
type PiAgentOptions = BaseCliAgentOptions & {
  provider?: string;
  model?: string;
  apiKey?: string;
  systemPrompt?: string;
  appendSystemPrompt?: string;
  mode?: "text" | "json" | "rpc";
  print?: boolean;
  continue?: boolean;
  resume?: boolean;
  session?: string;
  sessionDir?: string;
  noSession?: boolean;
  models?: string | string[];
  listModels?: boolean | string;
  tools?: string[];
  noTools?: boolean;
  extension?: string[];
  noExtensions?: boolean;
  skill?: string[];
  noSkills?: boolean;
  promptTemplate?: string[];
  noPromptTemplates?: boolean;
  theme?: string[];
  noThemes?: boolean;
  thinking?: "off" | "minimal" | "low" | "medium" | "high" | "xhigh";
  export?: string;
  files?: string[];
  verbose?: boolean;
  onExtensionUiRequest?: (request: PiExtensionUiRequest) =>
    | Promise<PiExtensionUiResponse | null>
    | PiExtensionUiResponse
    | null;
};
```

| Option | Description |
|---|---|
| `provider` | PI provider name (`--provider`) |
| `model` | PI model (`--model`) |
| `apiKey` | Passed to `--api-key` (prefer env/config for secrets) |
| `mode` | `text`, `json`, or `rpc` |
| `print` | Force `--print` in text mode |
| `continue` / `resume` / `session` | Session continuation controls |
| `sessionDir` | Custom session directory |
| `models` / `listModels` | Scoped model patterns and listing |
| `extension` | Extension path(s) |
| `skill` | Skill path(s) |
| `promptTemplate` | Prompt template path(s) |
| `theme` | Theme path(s) |
| `tools` / `noTools` | Enable specific tools or disable built-ins |
| `export` | Export session HTML |
| `files` | File args passed as `@path` (text/json modes) |
| `onExtensionUiRequest` | RPC-only handler for extension UI requests |
| `noSession` | Disable session persistence (default `true` unless session flags set) |

In text/json modes, the prompt is a positional argument and `files` emit as `@path` arguments. In rpc mode, the prompt is sent as JSON over stdin. Text mode defaults to `--print` without `--mode`; json/rpc set `--mode` and omit `--print`.
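
The mode rules above can be sketched as a small argument builder. This is an illustration only, not the actual `PiAgent` implementation: `buildPiInvocation`, the argument ordering, and the JSON payload shape used for rpc mode are all assumptions.

```ts
// Hypothetical sketch of how the PI mode could map to CLI arguments.
type PiMode = "text" | "json" | "rpc";

interface PiInvocation {
  args: string[];
  stdin?: string; // only set in rpc mode
}

function buildPiInvocation(
  mode: PiMode,
  prompt: string,
  files: string[] = [],
): PiInvocation {
  if (mode === "rpc") {
    // rpc: the prompt travels as JSON over stdin (payload shape is assumed).
    return {
      args: ["--mode", "rpc"],
      stdin: JSON.stringify({ type: "prompt", message: prompt }),
    };
  }
  // text defaults to --print without --mode; json sets --mode and omits --print.
  const modeArgs = mode === "text" ? ["--print"] : ["--mode", "json"];
  // files become @path arguments and the prompt is positional (order assumed).
  return { args: [...modeArgs, ...files.map((f) => `@${f}`), prompt] };
}
```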

For workflow hijack, Smithers automatically uses PI's structured event stream and keeps session persistence enabled regardless of `noSession`.

---

## KimiAgent

Wraps the `kimi` CLI using `--print` mode.

```ts
const kimi = new KimiAgent({
  model: "kimi-latest",
  thinking: true,
  timeoutMs: 300_000,
});
```

### Kimi-Specific Options

```ts
type KimiAgentOptions = BaseCliAgentOptions & {
  workDir?: string;
  session?: string;
  continue?: boolean;
  thinking?: boolean;
  outputFormat?: "text" | "stream-json";
  finalMessageOnly?: boolean;
  quiet?: boolean;
  agent?: "default" | "okabe";
  agentFile?: string;
  mcpConfigFile?: string[];
  mcpConfig?: string[];
  skillsDir?: string;
  maxStepsPerTurn?: number;
  maxRetriesPerStep?: number;
  maxRalphIterations?: number;
  verbose?: boolean;
  debug?: boolean;
};
```

| Option | Description |
|---|---|
| `thinking` | Enable/disable thinking mode |
| `outputFormat` | `"text"` or `"stream-json"` (default: `"text"`) |
| `finalMessageOnly` | Only print the final assistant message |
| `quiet` | Alias for `--print --output-format text --final-message-only` |
| `agent` | Built-in agent spec: `"default"` or `"okabe"` |
| `agentFile` | Path to custom agent specification file |
| `workDir` | Override the working directory for the kimi process |
| `session` / `continue` | Session resumption and continuation |
| `skillsDir` | Skills directory path |
| `mcpConfigFile` / `mcpConfig` | MCP config file(s) or inline config |
| `maxStepsPerTurn` | Max steps in one turn |
| `maxRetriesPerStep` | Max retries in one step |
| `maxRalphIterations` | Extra iterations after the first turn in Loop mode |

When `yolo` is `true` (default), passes `--print` which implicitly adds `--yolo`.

Prompt is passed via `--prompt`.

### Isolated Share Directory

Kimi stores per-session metadata in `~/.kimi/` (or `$KIMI_SHARE_DIR`). When running parallel tasks, concurrent writes to this directory can corrupt `kimi.json`. `KimiAgent` automatically creates an isolated temporary directory per invocation, copies `config.toml`, `credentials`, `device_id`, and `latest_version.txt` from the default share dir, and sets `KIMI_SHARE_DIR` to the temporary copy. The directory is removed via the cleanup hook when the run completes.

To opt out of isolation and use a specific directory, set `KIMI_SHARE_DIR` in `env`:

```ts
const kimi = new KimiAgent({
  model: "kimi-latest",
  env: { KIMI_SHARE_DIR: "/path/to/shared-kimi" },
});
```
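
The isolation step described in this section can be approximated with Node's filesystem helpers. This is a minimal sketch under stated assumptions: `createIsolatedShareDir` is a hypothetical name, the temp-directory prefix is invented, and the real `KimiAgent` may copy or clean up differently.

```ts
import { cpSync, existsSync, mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Entries the docs say are copied from the default share dir.
const SHARED_ENTRIES = ["config.toml", "credentials", "device_id", "latest_version.txt"];

function createIsolatedShareDir(sourceDir: string): { dir: string; cleanup: () => void } {
  // One fresh directory per invocation, so parallel runs never share kimi.json.
  const dir = mkdtempSync(join(tmpdir(), "kimi-share-"));
  for (const name of SHARED_ENTRIES) {
    const from = join(sourceDir, name);
    // Skip entries that do not exist; recursive copy handles files and directories.
    if (existsSync(from)) cpSync(from, join(dir, name), { recursive: true });
  }
  return { dir, cleanup: () => rmSync(dir, { recursive: true, force: true }) };
}
```

The returned `dir` would be passed to the process as `KIMI_SHARE_DIR`, and `cleanup` corresponds to the cleanup hook that removes the directory afterwards.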

---

## ForgeAgent

Wraps the `forge` CLI in `--prompt` mode. Forge supports 300+ models.

```ts
const forge = new ForgeAgent({
  model: "anthropic/claude-sonnet-4-20250514",
  provider: "anthropic",
  directory: "/path/to/project",
});
```

### Forge-Specific Options

```ts
type ForgeAgentOptions = BaseCliAgentOptions & {
  directory?: string;       // -C, --directory <DIR>
  provider?: string;        // --provider <PROVIDER>
  agent?: string;           // --agent <AGENT>
  conversationId?: string;  // --conversation-id <ID>
  sandbox?: string;         // --sandbox <NAME>
  restricted?: boolean;     // -r, --restricted
  verbose?: boolean;        // --verbose
  workflow?: string;        // -w, --workflow <FILE>
  event?: string;           // -e, --event <JSON>
  conversation?: string;    // --conversation <FILE>
};
```

| Option | Description |
|---|---|
| `directory` | Working directory (`-C`); defaults to `cwd` |
| `provider` | Model provider name |
| `agent` | Agent type |
| `conversationId` | Resume conversation by ID |
| `sandbox` | Sandbox name |
| `restricted` | Enable restricted mode |
| `workflow` | Workflow file path |
| `event` | Event JSON for workflow triggers |
| `conversation` | Conversation file path |

Forge `--prompt` mode auto-approves tool use; no separate yolo flag.

Prompt is passed via `--prompt`.

---

## AmpAgent

Wraps the `amp` CLI using `--execute` mode.

```ts
const amp = new AmpAgent({
  model: "claude-sonnet-4-20250514",
  visibility: "private",
  logLevel: "info",
});
```

### Amp-Specific Options

```ts
type AmpAgentOptions = BaseCliAgentOptions & {
  visibility?: "private" | "public" | "workspace" | "group";
  mcpConfig?: string;
  settingsFile?: string;
  logLevel?: "error" | "warn" | "info" | "debug" | "audit";
  logFile?: string;
  dangerouslyAllowAll?: boolean;
  ide?: boolean;
  jetbrains?: boolean;
};
```

| Option | Description |
|---|---|
| `visibility` | Thread visibility: `"private"`, `"public"`, `"workspace"`, `"group"` |
| `mcpConfig` | MCP configuration file path |
| `settingsFile` | Custom settings file path |
| `logLevel` | `"error"`, `"warn"`, `"info"`, `"debug"`, `"audit"` |
| `logFile` | Log output file path |
| `dangerouslyAllowAll` | Allow all tool calls without confirmation |

When `yolo` is `true` (default) or `dangerouslyAllowAll` is `true`, passes `--dangerously-allow-all`.

Prompt is passed via `--execute`. Automatically passes `--no-ide`, `--no-jetbrains`, `--no-color`, and `--archive` for headless execution.

---

## Diagnostics

At the start of each run, Smithers launches a diagnostic probe that runs concurrently with the agent process. If the agent fails, the probe's findings are attached to the error and printed as a warning.

```ts
// Diagnostics run automatically — no configuration required.
// On failure, err.details.diagnostics contains the full DiagnosticReport.
try {
  await claude.generate({ prompt: "..." });
} catch (err) {
  // err.details.diagnostics.checks contains the individual check results
}
```

Each `DiagnosticReport` contains:

```ts
type DiagnosticReport = {
  agentId: string;        // e.g. "claude-code"
  command: string;        // e.g. "claude"
  timestamp: string;      // ISO 8601
  checks: DiagnosticCheck[];
  durationMs: number;
};

type DiagnosticCheck = {
  id: "cli_installed" | "api_key_valid" | "rate_limit_status";
  status: "pass" | "fail" | "skip" | "error";
  message: string;
  detail?: Record<string, unknown>;
  durationMs: number;
};
```
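
As a usage illustration, a caller might reduce a report to its actionable lines before logging it. The types below are copied from the definitions above; `summarizeFailedChecks` is a hypothetical helper, not part of the Smithers API.

```ts
type DiagnosticCheck = {
  id: "cli_installed" | "api_key_valid" | "rate_limit_status";
  status: "pass" | "fail" | "skip" | "error";
  message: string;
  detail?: Record<string, unknown>;
  durationMs: number;
};

type DiagnosticReport = {
  agentId: string;
  command: string;
  timestamp: string;
  checks: DiagnosticCheck[];
  durationMs: number;
};

// Keep only checks that need attention and format one line per check.
function summarizeFailedChecks(report: DiagnosticReport): string[] {
  return report.checks
    .filter((check) => check.status === "fail" || check.status === "error")
    .map((check) => `${report.agentId}/${check.id}: ${check.message}`);
}
```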

### CLI Installed Check

The `cli_installed` check runs `which <command>` to confirm the binary is on `PATH`.

- **pass** — binary found; `detail.binaryPath` contains the resolved path.
- **fail** — binary not found; install the CLI listed in Prerequisites.

### API Key Check

The `api_key_valid` check verifies the API credential for each provider.

| Agent | Env var checked | Method |
|---|---|---|
| `ClaudeCodeAgent` | `ANTHROPIC_API_KEY` | Format check (`sk-ant-*`); absent = subscription mode (pass) |
| `CodexAgent` | `OPENAI_API_KEY` | `GET /v1/models` |
| `GeminiAgent` | `GOOGLE_API_KEY` or `GEMINI_API_KEY` | `GET /v1beta/models`; falls back to `gcloud auth` |
| `AmpAgent` | — | Skipped (Amp manages its own auth) |

### Rate Limit Check

The `rate_limit_status` check probes the provider's API for current quota headroom.

- Reads standard rate-limit headers (`anthropic-ratelimit-*`, `x-ratelimit-*`).
- Status is `skip` when using gcloud auth or subscription mode.
- If the check passed before the run but the error text contains rate-limit patterns (e.g. `429`, `too many requests`, `quota exceeded`), the check is upgraded to `fail` post-hoc and attached to the error.
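
The post-hoc upgrade in the last bullet could look roughly like this. It is a sketch, not the Smithers source: `upgradeRateLimitCheck` and the replacement message are hypothetical, and the pattern list contains only the examples named above.

```ts
type CheckStatus = "pass" | "fail" | "skip" | "error";

interface RateLimitCheck {
  id: "rate_limit_status";
  status: CheckStatus;
  message: string;
}

// Patterns taken from the examples in the docs; the real list may be longer.
const RATE_LIMIT_PATTERNS: RegExp[] = [/\b429\b/, /too many requests/i, /quota exceeded/i];

function upgradeRateLimitCheck(check: RateLimitCheck, errorText: string): RateLimitCheck {
  const looksRateLimited = RATE_LIMIT_PATTERNS.some((pattern) => pattern.test(errorText));
  // Only a check that passed before the run is upgraded; skip/fail/error stay as-is.
  if (check.status === "pass" && looksRateLimited) {
    return { ...check, status: "fail", message: "Rate limit detected in error output" };
  }
  return check;
}
```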

---

## Capability Registry

Every CLI agent exposes a `capabilities` property that describes its tool surface. Smithers uses this at runtime to normalize tool names and verify that the agent configuration is self-consistent.

```ts
console.log(claude.capabilities);
// {
//   version: 1,
//   engine: "claude-code",
//   runtimeTools: {},
//   mcp: { bootstrap: "project-config", supportsProjectScope: true, supportsUserScope: true },
//   skills: { supportsSkills: true, installMode: "plugin", smithersSkillIds: [] },
//   humanInteraction: { supportsUiRequests: false, methods: [] },
//   builtIns: ["default", "slash-commands"]
// }
```

### Normalization

`normalizeCapabilityRegistry` canonicalizes a registry before comparison or hashing: string lists are deduplicated and sorted, tool descriptor fields are trimmed, and empty optional values are removed.

```ts
import { normalizeCapabilityRegistry } from "smithers-orchestrator";

const canonical = normalizeCapabilityRegistry(agent.capabilities);
```

`normalizeCapabilityStringList` applies the same rules to any standalone string array:

```ts
import { normalizeCapabilityStringList } from "smithers-orchestrator";

normalizeCapabilityStringList(["!bash", "default", "default", " web_search "])
// ["!bash", "default", "web_search"]
```
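
For intuition, the list rules (trim, drop empties, deduplicate, sort) fit in a few lines. This is a stand-alone approximation with a hypothetical name, `normalizeStringList`; the exported function may differ in details such as sort order for non-ASCII input.

```ts
// Approximation of the documented rules; not the library implementation.
function normalizeStringList(values: string[]): string[] {
  const trimmed = values.map((value) => value.trim()).filter((value) => value.length > 0);
  // Set removes duplicates; the default sort compares UTF-16 code units.
  return Array.from(new Set(trimmed)).sort();
}

console.log(normalizeStringList(["!bash", "default", "default", " web_search "]));
// ["!bash", "default", "web_search"]
```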

### Hashing

`hashCapabilityRegistry` produces a stable SHA-256 hex fingerprint of the normalized registry. Use it to detect configuration drift between agent invocations or CI runs.

```ts
import { hashCapabilityRegistry } from "smithers-orchestrator";

const fingerprint = hashCapabilityRegistry(agent.capabilities);
// "a3f1c9..."
```

The hash is also returned in `getCliAgentCapabilityReport()` as `entry.fingerprint`.
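
A stable fingerprint depends on serializing the registry the same way every time. The sketch below shows one common approach (sorted-key JSON fed into SHA-256); `stableStringify` and `fingerprintOf` are hypothetical names, and the exact serialization Smithers uses is not documented here.

```ts
import { createHash } from "node:crypto";

// Serialize with object keys sorted so key order never changes the hash.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .filter((key) => obj[key] !== undefined)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`);
    return `{${entries.join(",")}}`;
  }
  const json = JSON.stringify(value);
  return json === undefined ? "null" : json;
}

function fingerprintOf(value: unknown): string {
  return createHash("sha256").update(stableStringify(value)).digest("hex");
}
```

Two registries that differ only in key order produce the same fingerprint, which is what makes the value usable for drift detection.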

### Capability Doctor

`getCliAgentCapabilityDoctorReport()` validates every built-in CLI agent's registry against consistency rules and returns a report with per-agent issues:

```ts
import {
  formatCliAgentCapabilityDoctorReport,
  getCliAgentCapabilityDoctorReport,
} from "smithers-orchestrator";

const report = getCliAgentCapabilityDoctorReport();
if (!report.ok) {
  console.error(formatCliAgentCapabilityDoctorReport(report));
}
```

---

## Agent Contract

The agent contract describes the Smithers [MCP server](/integrations/mcp-server) tool surface that is injected into an agent's context. It is separate from the capability registry — the registry describes what the agent *can* do, while the contract describes what Smithers *exposes* to the agent.

### Raw vs. Semantic Tool Surface

`SmithersToolSurface` is `"raw"` or `"semantic"`. The semantic surface groups and renames tools to reduce noise for general-purpose agents. The raw surface exposes every tool name as-is.

```ts
import { buildSmithersMcpConfigFile } from "smithers-orchestrator";

const { path, cleanup } = buildSmithersMcpConfigFile("semantic");
// Writes a temporary mcp.json pointing at the Smithers MCP server
```

### Live MCP Tool Probe

`listLiveSmithersMcpTools` starts the Smithers [MCP server](/integrations/mcp-server) in a subprocess and calls `tools/list` to retrieve the live tool set. Use this to build a contract from the actual running server rather than a static snapshot.

```ts
import { listLiveSmithersMcpTools } from "smithers-orchestrator";

const tools = await listLiveSmithersMcpTools({ toolSurface: "semantic" });
```

`probeSmithersAgentContract` wraps the probe and returns a full `SmithersAgentContract`:

```ts
import { probeSmithersAgentContract } from "smithers-orchestrator";

const contract = await probeSmithersAgentContract({ toolSurface: "semantic" });
```

### Prompt Guidance

`contract.promptGuidance` is a compact, instruction-friendly string listing available tools grouped by category. Inject it into an agent's system prompt:

```ts
const claude = new ClaudeCodeAgent({
  systemPrompt: contract.promptGuidance,
});
```

Example output:

```
You have access to the live Smithers semantic MCP surface on server "smithers".
Only rely on the tool names listed here.
For workflow discovery and launch, use `list_workflows`, `run_workflow`.
For run inspection and control, use `cancel`, `get_run`, `list_runs`.
Potentially destructive tools: `cancel`, `run_workflow`. Confirm intent before using them.
```

### Docs Guidance

`contract.docsGuidance` is a Markdown table listing every tool with its category, destructive flag, and description. Suitable for injecting into documentation or longer context windows:

```ts
console.log(contract.docsGuidance);
// ## Smithers semantic Tool Surface
// | Tool | Category | Destructive | Description |
// | --- | --- | --- | --- |
// | `list_workflows` | workflows | no | List available workflows. |
// ...
```

---

## Token and Usage Tracking

Smithers extracts token usage from raw CLI output and populates the `usage` field of the returned `GenerateTextResult`. This works across all built-in agents without additional configuration.

```ts
const result = await claude.generate({ prompt: "..." });
console.log(result.usage);
// {
//   inputTokens: 1024,
//   outputTokens: 512,
//   inputTokenDetails: { cacheReadTokens: 128, cacheWriteTokens: 64 },
//   outputTokenDetails: { reasoningTokens: 0 },
//   totalTokens: 1536
// }
```

### Usage Extraction

`extractUsageFromOutput` parses the raw CLI stdout to find token counts. The extraction strategy is format-specific:

| Agent / Format | Source |
|---|---|
| `ClaudeCodeAgent` `stream-json` | `message_start.message.usage` (input) + `message_delta.usage` (output) |
| `CodexAgent` `--json` | `turn.completed.usage` |
| `GeminiAgent` `json` | `stats.models[*].tokens` |
| Generic NDJSON | Any line with a `usage` object containing `input_tokens` / `output_tokens` |

Cache read tokens (`cache_read_input_tokens`, `cached_input_tokens`), cache write tokens (`cache_creation_input_tokens`), and reasoning tokens (`reasoning_tokens`) are accumulated when present.
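
The generic NDJSON row of the table can be illustrated with a small parser. This is a sketch under assumptions: `extractUsageFromNdjson` is a hypothetical stand-in for the real extractor, counts are summed across all matching lines, and `totalTokens` is taken as input plus output.

```ts
interface UsageTotals {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
  reasoningTokens: number;
  totalTokens: number;
}

const toCount = (value: unknown): number =>
  typeof value === "number" && Number.isFinite(value) ? value : 0;

function extractUsageFromNdjson(stdout: string): UsageTotals {
  const totals: UsageTotals = {
    inputTokens: 0, outputTokens: 0, cacheReadTokens: 0,
    cacheWriteTokens: 0, reasoningTokens: 0, totalTokens: 0,
  };
  for (const line of stdout.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("{")) continue; // ignore non-JSON lines
    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // tolerate truncated or malformed lines
    }
    const usage = (parsed as { usage?: unknown } | null)?.usage;
    if (usage === null || typeof usage !== "object") continue;
    const u = usage as Record<string, unknown>;
    totals.inputTokens += toCount(u["input_tokens"]);
    totals.outputTokens += toCount(u["output_tokens"]);
    totals.cacheReadTokens += toCount(u["cache_read_input_tokens"]) + toCount(u["cached_input_tokens"]);
    totals.cacheWriteTokens += toCount(u["cache_creation_input_tokens"]);
    totals.reasoningTokens += toCount(u["reasoning_tokens"]);
  }
  totals.totalTokens = totals.inputTokens + totals.outputTokens;
  return totals;
}
```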

---

## BaseCliAgent Internals

### Cleanup Hook

`CliCommandSpec.cleanup` is an optional `async () => void` returned by `buildCommand`. It runs after the agent process exits, whether the run succeeds or fails. Use it to remove temporary files:

```ts
// KimiAgent uses this pattern internally:
return {
  command: "kimi",
  args,
  env: { KIMI_SHARE_DIR: isolatedDir },
  cleanup: async () => {
    rmSync(isolatedDir, { recursive: true, force: true });
  },
};
```

The cleanup runs under `Effect.ensuring`, so it is guaranteed to execute even when the command throws.

### Stdout Error Detection

Some CLIs exit with code 0 but print an error message to stdout. The `stdoutErrorPatterns` field on `CliCommandSpec` accepts an array of `RegExp` patterns. If any pattern matches the cleaned stdout text (after banner stripping), the agent throws `AGENT_CLI_ERROR` with the matched content as the message:

```ts
return {
  command: "mycli",
  args,
  stdoutErrorPatterns: [/^Error:/m, /authentication failed/i],
};
```

Detection is skipped when stdout starts with `{` or `[` (i.e., JSON output).
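
A rough model of this check, including the JSON skip rule, is shown below. `findStdoutError` is a hypothetical helper; whether leading whitespace is ignored before the `{` / `[` test is an assumption.

```ts
// Returns the matched text, or null when no pattern applies.
function findStdoutError(cleanedStdout: string, patterns: RegExp[]): string | null {
  const text = cleanedStdout.trimStart();
  // JSON output is exempt, so error-like strings inside a payload are not flagged.
  if (text.startsWith("{") || text.startsWith("[")) return null;
  for (const pattern of patterns) {
    // Patterns without the `g` flag keep exec() stateless between calls.
    const match = pattern.exec(text);
    if (match) return match[0];
  }
  return null;
}
```

In the agent, a non-null result would be the point at which `AGENT_CLI_ERROR` is thrown.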

### Banner Stripping

CLI tools occasionally print version banners, update notices, or telemetry lines to stdout before the model response. The `stdoutBannerPatterns` field on `CliCommandSpec` accepts an array of `RegExp` patterns that are stripped from stdout before text extraction:

```ts
return {
  command: "mycli",
  args,
  stdoutBannerPatterns: [/^mycli v\d+\.\d+\.\d+.*\n/m],
  errorOnBannerOnly: true, // throw if only a banner was printed (no model response)
};
```
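
Conceptually, stripping is a sequence of regex replacements followed by a check for whether anything is left. The helper below is a hypothetical sketch (`stripBanners` is not a Smithers export) of how `stdoutBannerPatterns` and `errorOnBannerOnly` could interact.

```ts
function stripBanners(
  stdout: string,
  patterns: RegExp[],
): { text: string; bannerOnly: boolean } {
  let text = stdout;
  for (const pattern of patterns) {
    text = text.replace(pattern, "");
  }
  // Banner-only: there was output, but nothing survived stripping.
  const bannerOnly = stdout.trim().length > 0 && text.trim().length === 0;
  return { text, bannerOnly };
}
```

With `errorOnBannerOnly: true`, a `bannerOnly` result is the case that would raise an error instead of returning an empty response.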

---

## Agent Interface

All CLI agents implement two methods.

### `generate(options)`

Runs the CLI to completion and resolves to a `GenerateTextResult`:

```ts
const result = await claude.generate({
  prompt: "Explain the architecture of this codebase.",
});
console.log(result.text);
```

1. Extracts prompt from `options.prompt` (string) or `options.messages` (array).
2. Builds the CLI command with all configured flags.
3. Spawns the process and captures stdout/stderr.
4. For `json`/`stream-json` output, extracts text from the JSON payload.
5. Returns the result as a `GenerateTextResult`.

### `stream(options)`

Calls `generate()` internally and wraps the result as a `StreamTextResult`. The output is not streamed incrementally: the text becomes available only once `generate()` has finished.

```ts
const stream = await claude.stream({ prompt: "Review this code." });
for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}
```

---

## Message Handling

When called with messages, agents convert them to a text prompt:

- System messages are extracted and prepended as a system prompt.
- User/assistant messages are formatted as `ROLE: content`, joined with double newlines.
- The system prompt extracted from the messages is combined with any `systemPrompt` set on the agent instance.
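
The conversion rules above can be sketched as follows. `messagesToPrompt` is a hypothetical name; the upper-case role label, the order in which system prompts are combined, and the separator between them are assumptions.

```ts
type Role = "system" | "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

function messagesToPrompt(
  messages: Message[],
  agentSystemPrompt?: string,
): { systemPrompt: string | undefined; prompt: string } {
  // Agent-level prompt first, then system messages in order (ordering assumed).
  const systemParts = [
    agentSystemPrompt,
    ...messages.filter((m) => m.role === "system").map((m) => m.content),
  ].filter((part): part is string => typeof part === "string" && part.length > 0);

  // Remaining turns become "ROLE: content" blocks joined with double newlines.
  const prompt = messages
    .filter((m) => m.role !== "system")
    .map((m) => `${m.role.toUpperCase()}: ${m.content}`)
    .join("\n\n");

  return {
    systemPrompt: systemParts.length > 0 ? systemParts.join("\n\n") : undefined,
    prompt,
  };
}
```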

---

## Example: Multi-Agent Workflow

```tsx
import { ClaudeCodeAgent, CodexAgent, createSmithers, Task } from "smithers-orchestrator";
import { z } from "zod";

const reviewer = new ClaudeCodeAgent({
  model: "claude-sonnet-4-20250514",
  systemPrompt: "You are a thorough code reviewer.",
  timeoutMs: 120_000,
});

const fixer = new CodexAgent({
  model: "gpt-4.1",
  fullAuto: true,
  timeoutMs: 180_000,
});

const { Workflow, smithers, outputs } = createSmithers({
  review: z.object({ summary: z.string() }),
  fix: z.object({ result: z.string() }),
});

export default smithers((ctx) => (
  <Workflow name="review-and-fix">
    <Task id="review" output={outputs.review} agent={reviewer}>
      {`Review the changes in this PR and identify issues.`}
    </Task>
    <Task id="fix" output={outputs.fix} agent={fixer}>
      {`Fix these issues: ${ctx.output(outputs.review, { nodeId: "review" }).summary}`}
    </Task>
  </Workflow>
));
```

## Next Steps

- [SDK Agents](/integrations/sdk-agents)
- [MCP Server](/integrations/mcp-server)
- [Agents and Tools](/concepts/agents-and-tools)
- [Multi-Agent Review Example](/examples/multi-agent-review)

---

## SDK Agents

> Provider-backed AI SDK agent wrappers for Anthropic and OpenAI that work like first-class Smithers agents.
> Source: https://smithers.sh/integrations/sdk-agents

`AnthropicAgent` and `OpenAIAgent` are thin wrappers around the [AI SDK](https://ai-sdk.dev) `ToolLoopAgent` with class-style ergonomics matching the [CLI agents](/integrations/cli-agents).

## Import

```ts
import {
  AnthropicAgent,
  OpenAIAgent,
  tools,
} from "smithers-orchestrator";
import { stepCountIs } from "ai";
```

## Quick Start

```ts
const claude = new AnthropicAgent({
  model: "claude-opus-4-6",
  tools,
  instructions: "You are a careful planner.",
  stopWhen: stepCountIs(40),
});

const codex = new OpenAIAgent({
  model: "gpt-5.3-codex",
  tools,
  instructions: "You are a precise implementation agent.",
  stopWhen: stepCountIs(40),
});
```

```tsx
{/* outputs comes from createSmithers() */}
<Task id="plan" output={outputs.plan} agent={claude}>
  Analyze the repository and propose a migration plan.
</Task>
```

## Model Input

Both classes accept a model ID string (`"claude-opus-4-6"`, `"gpt-5.3-codex"`) or a prebuilt AI SDK language model instance.

## Options

Constructors forward standard AI SDK `ToolLoopAgent` settings:

- `instructions`
- `tools`
- `stopWhen`
- `maxOutputTokens`
- `temperature`
- `providerOptions`
- `prepareCall`

The only addition is `model`: the wrapper resolves model-ID strings automatically.

## Hijack Support

SDK agents do not reopen a provider-native CLI. Smithers persists the agent conversation and reopens it through a Smithers-managed REPL via `smithers hijack <runId>`.

Live-run behavior:

- Smithers captures response history after each step via `onStepFinish`.
- `smithers hijack` waits until history is durable, cancels the live run, and opens the REPL.
- On clean REPL exit, Smithers writes updated message history back and resumes the workflow automatically.

Limits:

- Conversation hijack stays on the same agent implementation. Cross-engine hijack is not supported.
- Smithers reconstructs the original task agent from the workflow source.

## CLI vs SDK

| | CLI Agents | SDK Agents |
|---|---|---|
| Billing | Provider subscription / local CLI | API billing |
| Tools | Provider CLI tool ecosystem | Smithers [tools](/integrations/tools) and [sandbox](/components/sandbox) |
| Flexibility | Native CLI flags | AI SDK `providerOptions` |

Pass a raw `ToolLoopAgent` directly if you prefer. The wrappers are convenience, not a separate runtime.

## Example: Dual Setup

```ts
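import { AnthropicAgent, ClaudeCodeAgent, tools } from "smithers-orchestrator";
import { stepCountIs } from "ai";

const useCli = process.env.USE_CLI_AGENTS === "1";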

export const claude = useCli
  ? new ClaudeCodeAgent({
      model: "claude-opus-4-6",
      dangerouslySkipPermissions: true,
    })
  : new AnthropicAgent({
      model: "claude-opus-4-6",
      tools,
      instructions: "You are a careful planner.",
      stopWhen: stepCountIs(40),
    });
```

## Next Steps

- [CLI Agents](/integrations/cli-agents)
- [Built-in Tools](/integrations/tools)
- [Agents and Tools](/concepts/agents-and-tools)

---

## HTTP Server

> Expose Smithers workflows over HTTP with a built-in server supporting run management, SSE event streaming, and human-in-the-loop approvals.
> Source: https://smithers.sh/integrations/server

Multi-workflow HTTP server exposing workflows via REST. Supports run management, SSE event streaming, and [human-in-the-loop approvals](/concepts/human-in-the-loop).

For a lighter single-workflow server that runs alongside `smithers up`, see [Serve Mode](/integrations/serve).

## Import

```ts
import { startServer } from "smithers-orchestrator";
```

## Quick Start

```ts
import { startServer } from "smithers-orchestrator";
import { drizzle } from "drizzle-orm/bun-sqlite";

const db = drizzle("./smithers.db");

const server = startServer({
  port: 7331,
  db,
  authToken: process.env.SMITHERS_API_KEY,
  rootDir: process.cwd(),
  allowNetwork: false,
});
```

## ServerOptions

```ts
type ServerOptions = {
  port?: number;
  db?: BunSQLiteDatabase<any>;
  authToken?: string;
  maxBodyBytes?: number;
  rootDir?: string;
  allowNetwork?: boolean;
};
```

| Option | Type | Default | Description |
|---|---|---|---|
| `port` | `number` | `7331` | TCP port |
| `db` | `BunSQLiteDatabase` | `undefined` | SQLite database for mirroring run/event data; enables `GET /v1/runs` |
| `authToken` | `string` | `process.env.SMITHERS_API_KEY` | Bearer token. Falls back to env var. Disabled if neither is set. |
| `maxBodyBytes` | `number` | `1048576` (1MB) | Max request body size. Returns 413 if exceeded. |
| `rootDir` | `string` | `undefined` | Root for workflow path resolution and tool sandboxing |
| `allowNetwork` | `boolean` | `false` | Allow network access in [`bash`](/integrations/tools#bash) |

Returns an `http.Server` instance, already listening.

### Effect API

`startServerEffect` returns an `Effect` wrapping the server startup for use inside Effect-based applications.

```ts
import { startServerEffect } from "smithers-orchestrator";
import { Effect } from "effect";

const program = startServerEffect({ port: 7331, db, rootDir: process.cwd() }).pipe(
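  Effect.tap((server) => {
    // address() returns an AddressInfo object (or a string for pipes), not a port number.
    const address = server.address();
    const port = typeof address === "object" && address !== null ? address.port : address;
    return Effect.logInfo(`Server listening on port ${port}`);
  }),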
);
```

---

## Authentication

When `authToken` is configured, every request must include:

- `Authorization: Bearer <token>`, or
- `x-smithers-key: <token>`

Missing/invalid tokens receive `401`.
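
The header rules can be modeled with a small predicate. This is an illustrative sketch, not the server's code: `isAuthorized` is a hypothetical name, header keys are assumed to be lower-cased (as Node's `http` module delivers them), and a production check would typically use a constant-time comparison.

```ts
function isAuthorized(
  headers: Record<string, string | undefined>,
  authToken: string | undefined,
): boolean {
  // No token configured: authentication is disabled.
  if (authToken === undefined || authToken.length === 0) return true;
  if (headers["authorization"] === `Bearer ${authToken}`) return true;
  return headers["x-smithers-key"] === authToken;
}
```

A `false` result corresponds to the `401` response described above.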

---

## Observability

The server participates in the standard observability pipeline:

- `smithers.http.requests` counter
- `smithers.http.request_duration_ms` histogram
- Request handling, workflow loading, and body parsing wrapped in spans
- Prometheus scrape endpoint at `/metrics`

OTLP export:

```bash
export SMITHERS_OTEL_ENABLED=1
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_SERVICE_NAME=smithers-server
```

Local collector stack:

```bash
docker compose -f observability/docker-compose.otel.yml up
```

See [Observability](/guides/monitoring-logs) for the full metrics list.

---

## API Routes

All routes use JSON request/response bodies except `GET /v1/runs/:runId/events` (SSE) and `GET /metrics` (Prometheus text).

JSON responses include `Content-Type: application/json`, `Cache-Control: no-store`, and `X-Content-Type-Options: nosniff`.

### GET /metrics

Prometheus text exposition of runtime metrics.

```txt
# TYPE smithers_http_requests counter
smithers_http_requests 12
```

```yaml
scrape_configs:
  - job_name: smithers
    static_configs:
      - targets: ["localhost:7331"]
```

### POST /v1/runs

Start a new workflow run or resume an existing one.

```ts
{
  workflowPath: string;          // .tsx workflow file (required)
  input?: Record<string, any>;   // Workflow input (default: {})
  runId?: string;                // Custom run ID (default: auto-generated)
  resume?: boolean;              // Resume existing run (default: false)
  config?: {
    maxConcurrency?: number;
  };
}
```

Response: `{ "runId": "smi_abc123" }`

The workflow is dynamically imported, tables are auto-created, and the run starts asynchronously.

| Status | Code | Condition |
|---|---|---|
| 400 | `INVALID_REQUEST` | Missing/invalid `workflowPath`, `input`, or `config` |
| 400 | `RUN_ID_REQUIRED` | `resume: true` without `runId` |
| 400 | `WORKFLOW_PATH_OUTSIDE_ROOT` | Path resolves outside `rootDir` |
| 404 | `RUN_NOT_FOUND` | `resume: true` but run does not exist |
| 409 | `RUN_IN_PROGRESS` | Run with this ID already active |
| 409 | `RUN_ALREADY_EXISTS` | Run with this ID exists (no `resume`) |
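
The two body-level errors in this table can be reproduced with a stateless validator, which is handy for client-side checks before sending a request. `validateStartRun` is a hypothetical helper and its exact rules are assumptions; the remaining codes depend on server state (`rootDir`, existing runs) and are not modeled here.

```ts
type StartRunError = "INVALID_REQUEST" | "RUN_ID_REQUIRED";

const isPlainObject = (value: unknown): value is Record<string, unknown> =>
  typeof value === "object" && value !== null && !Array.isArray(value);

// Returns the error code the request would trigger, or null if the body looks valid.
function validateStartRun(body: unknown): StartRunError | null {
  if (!isPlainObject(body)) return "INVALID_REQUEST";
  const { workflowPath, input, runId, resume, config } = body;
  if (typeof workflowPath !== "string" || workflowPath.length === 0) return "INVALID_REQUEST";
  if (input !== undefined && !isPlainObject(input)) return "INVALID_REQUEST";
  if (config !== undefined && !isPlainObject(config)) return "INVALID_REQUEST";
  // Resuming needs an explicit run to resume.
  if (resume === true && typeof runId !== "string") return "RUN_ID_REQUIRED";
  return null;
}
```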

### POST /v1/runs/:runId/resume

Resume a paused or failed run.

```ts
{
  workflowPath: string;
  input?: Record<string, any>;
  config?: { maxConcurrency?: number };
}
```

Response: `{ "runId": "smi_abc123" }`

If currently active, returns `200` with current status. Otherwise reloads the workflow and resumes from last checkpoint.

| Status | Code | Condition |
|---|---|---|
| 400 | `INVALID_REQUEST` | Missing/invalid `workflowPath` |
| 404 | `RUN_NOT_FOUND` | Run does not exist |

### POST /v1/runs/:runId/cancel

Cancel a running workflow. Signals the run's `AbortController`.

Response: `{ "runId": "smi_abc123" }`

| Status | Code | Condition |
|---|---|---|
| 404 | `NOT_FOUND` | Run not in active runs |
| 409 | `RUN_NOT_ACTIVE` | Run's heartbeat is stale (see Run Heartbeat Tracking) |

### GET /v1/runs/:runId

Run status and summary.

```json
{
  "runId": "smi_abc123",
  "workflowName": "bugfix",
  "status": "running",
  "startedAtMs": 1707500000000,
  "finishedAtMs": null,
  "summary": { "finished": 3, "in-progress": 1, "pending": 2 }
}
```

| Field | Type | Description |
|---|---|---|
| `status` | `string` | `running`, `waiting-approval`, `finished`, `failed`, `cancelled` |
| `startedAtMs` | `number \| null` | Start timestamp (ms) |
| `finishedAtMs` | `number \| null` | Finish timestamp (ms) |
| `summary` | `object` | Node count by state |

### GET /v1/runs/:runId/events

SSE stream of lifecycle events.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `afterSeq` | `number` | `-1` | Only events after this sequence number |

```
retry: 1000

event: smithers
data: {"type":"RunStarted","runId":"smi_abc123","timestampMs":1707500000000}

event: smithers
data: {"type":"NodeStarted","runId":"smi_abc123","nodeId":"analyze","iteration":0,"attempt":0,"timestampMs":1707500001000}

: keep-alive

event: smithers
data: {"type":"NodeFinished","runId":"smi_abc123","nodeId":"analyze","iteration":0,"attempt":0,"timestampMs":1707500010000}
```

- Events named `smithers` with JSON payloads matching [`SmithersEvent`](/runtime/events).
- Polls database every 500ms.
- Keep-alive comment every 10s.
- Closes on terminal state (`finished`, `failed`, `cancelled`).
- Reconnect with `afterSeq` to resume.
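
For clients that read the stream without an `EventSource` implementation, the wire format shown above can be parsed by hand. The parser below is a simplified sketch with a hypothetical name, `parseSseEvents`; it assumes the whole text is already buffered and that each `data:` payload is valid JSON.

```ts
interface SseEvent {
  event: string;
  data: unknown;
}

function parseSseEvents(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Events are separated by a blank line.
  for (const block of raw.split(/\r?\n\r?\n/)) {
    let event = "message"; // SSE default event name
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      // Lines starting with ":" are comments (keep-alive); retry is a reconnect hint.
      if (line.startsWith(":") || line.startsWith("retry:")) continue;
      if (line.startsWith("event:")) event = line.slice("event:".length).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice("data:".length).trimStart());
    }
    if (dataLines.length > 0) {
      events.push({ event, data: JSON.parse(dataLines.join("\n")) });
    }
  }
  return events;
}
```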

### GET /v1/runs/:runId/frames

List render frames.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `limit` | `number` | `50` | Max frames |
| `afterFrameNo` | `number` | `undefined` | Frames after this number |

### POST /v1/runs/:runId/nodes/:nodeId/approve

Approve a node waiting for [human approval](/concepts/approvals).

```ts
{
  iteration?: number;     // Default: 0
  note?: string;
  decidedBy?: string;
}
```

### POST /v1/runs/:runId/nodes/:nodeId/deny

Deny a node waiting for [human approval](/concepts/approvals).

```ts
{
  iteration?: number;     // Default: 0
  note?: string;
  decidedBy?: string;
}
```

### GET /v1/runs

List all runs. Requires server-level `db`.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `limit` | `number` | `50` | Max runs |
| `status` | `string` | `undefined` | Filter by status |

Returns 400 `DB_NOT_CONFIGURED` if no database was provided.

### GET /v1/approvals

List all pending [approvals](/concepts/approvals) across runs. Requires server-level `db`.

```json
{
  "approvals": [
    {
      "runId": "smi_abc123",
      "nodeId": "deploy",
      "iteration": 0,
      "workflowName": "bugfix",
      "runStatus": "waiting-approval",
      "label": "deploy",
      "requestTitle": "deploy",
      "requestSummary": null,
      "requestedAtMs": 1707500100000,
      "waitingMs": 45000,
      "note": null,
      "decidedBy": null
    }
  ]
}
```

Results are sorted by `requestedAtMs` ascending (oldest first). Returns 400 `DB_NOT_CONFIGURED` if no database was provided.

Also accessible at legacy paths: `GET /v1/approval/list`, `GET /approval/list`, `GET /approvals`.

### POST /v1/runs/:runId/signals/:signalName

Deliver a named signal to a running workflow.

```ts
{
  data?: Record<string, any>;    // Signal payload (default: {})
  correlationId?: string;        // Optional correlation ID
  receivedBy?: string;           // Optional actor name
}
```

Response: `{ "delivered": true }` (or `false` if the run has no listener for that signal name).

| Status | Code | Condition |
|---|---|---|
| 404 | `NOT_FOUND` | Run not found |

Also accessible at legacy path: `POST /signal/:runId/:signalName`.

### GET /health

Liveness probe. Returns `200 OK` with a JSON body when the server is up.

```bash
curl http://localhost:7331/health
# {"ok":true}
```

No authentication is required for this endpoint — it is exempt from `authToken` checks.

---

## Error Response Format

```json
{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable description",
    "details": {}
  }
}
```

Unhandled errors return 500 with code `SERVER_ERROR`.

---

## Hot Reload

Each `POST /v1/runs` and `POST /v1/runs/:runId/resume` request performs a fresh load of the workflow file. The server hashes the source, writes a content-addressed shadow copy, and imports it via a unique URL. This means the workflow file on disk can be updated between requests without restarting the server; each new run picks up the latest version automatically.

The shadow file is named `.${workflowName}.smithers-${sha1hash}.tsx` and lives next to the original. It is safe to delete these files after runs complete.
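
To make the naming scheme concrete, the sketch below derives a shadow path from a workflow path and its source text. It is illustrative only: `shadowPathFor` is a hypothetical name, and it assumes `workflowName` is the file's base name without extension and that the full SHA-1 hex digest is used.

```ts
import { createHash } from "node:crypto";
import { basename, dirname, extname, join } from "node:path";

function shadowPathFor(workflowPath: string, source: string): string {
  // Content-addressed: identical source always maps to the same shadow file.
  const hash = createHash("sha1").update(source).digest("hex");
  const name = basename(workflowPath, extname(workflowPath));
  return join(dirname(workflowPath), `.${name}.smithers-${hash}.tsx`);
}
```

Because the name depends on the content, editing the workflow produces a new shadow file rather than overwriting the one an earlier run imported.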

## Run Heartbeat Tracking

Active runs write a heartbeat timestamp to `_smithers_runs.heartbeat_at_ms` every 5 seconds. The server uses this to distinguish truly running workflows from stale rows left by a previous process crash.

`isRunHeartbeatFresh` returns `false` if the heartbeat is more than 5 seconds old. In that case:

- `POST /v1/runs` with `resume: true` will resume rather than reject the run as already active.
- `POST /v1/runs/:runId/resume` will resume rather than return `200 { status: "running" }`.
- `POST /v1/runs/:runId/cancel` returns `409 RUN_NOT_ACTIVE` for stale runs instead of aborting.
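
The freshness rule reduces to a timestamp comparison. The function below is a simplified stand-in for `isRunHeartbeatFresh` with a hypothetical name and signature; treating a missing heartbeat as stale is an assumption.

```ts
const STALE_AFTER_MS = 5_000; // threshold stated above

function isHeartbeatFresh(
  heartbeatAtMs: number | null,
  nowMs: number = Date.now(),
): boolean {
  // A run that never wrote a heartbeat is treated as stale (assumption).
  if (heartbeatAtMs === null) return false;
  return nowMs - heartbeatAtMs <= STALE_AFTER_MS;
}
```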

On server shutdown, the `close` event fires the abort signal for all active run `AbortController`s, giving workflows a chance to checkpoint before exit.

---

## Database Mirroring

When a server-level `db` is provided and the workflow uses a different database, run metadata and events are mirrored asynchronously to the server database. This enables cross-workflow listing via `GET /v1/runs`.

---

## Example

```ts
import { startServer } from "smithers-orchestrator";
import { drizzle } from "drizzle-orm/bun-sqlite";

const db = drizzle("./server.db");

const server = startServer({
  port: 7331,
  db,
  authToken: "sk-my-secret-token",
  rootDir: "/home/workflows",
  maxBodyBytes: 2 * 1024 * 1024,
  allowNetwork: false,
});
```

```bash
# Start a run
curl -X POST http://localhost:7331/v1/runs \
  -H "Authorization: Bearer sk-my-secret-token" \
  -H "Content-Type: application/json" \
  -d '{"workflowPath": "./bugfix.tsx", "input": {"description": "Fix the auth token expiry bug"}}'

# Stream events
curl -N http://localhost:7331/v1/runs/smi_abc123/events \
  -H "Authorization: Bearer sk-my-secret-token"

# Approve a node
curl -X POST http://localhost:7331/v1/runs/smi_abc123/nodes/deploy/approve \
  -H "Authorization: Bearer sk-my-secret-token" \
  -H "Content-Type: application/json" \
  -d '{"note": "Looks good", "decidedBy": "alice"}'
```

## Next Steps

- [Serve Mode](/integrations/serve)
- [Gateway](/integrations/gateway)
- [Runtime Events](/runtime/events)
- [Human in the Loop](/concepts/human-in-the-loop)

---

## MCP Server

> Expose Smithers as a Model Context Protocol stdio server so any MCP client — Claude Code, Cursor, Codex, or your own agent — can list, run, inspect, and control workflows without shell scripting.
> Source: https://smithers.sh/integrations/mcp-server

Smithers ships a built-in MCP stdio server. When you pass `--mcp` to the CLI it speaks the Model Context Protocol over stdin/stdout instead of acting as an interactive CLI. Any MCP-aware client can connect, discover your workflows, start runs, watch progress, resolve approvals, and revert bad attempts — all through structured, machine-readable tool calls.

Use the MCP server when you want an AI agent to drive Smithers autonomously. Use the [HTTP Server](/integrations/server) when you need REST endpoints for human-written code or webhooks.

---

## Setup

### Start the server

```bash
smithers --mcp
```

By default this starts the semantic surface — a stable, structured set of tools designed for AI agent consumption. The semantic surface is what this page documents.

Two additional surfaces are available via `--surface`:

```bash
# Semantic tools only (default)
smithers --mcp --surface semantic

# Raw CLI-mirroring tools only
smithers --mcp --surface raw

# Both surfaces registered on the same server
smithers --mcp --surface both
```

Use `--surface raw` only when you need direct CLI parity. The semantic surface is strongly preferred for new integrations: every tool returns a consistent `{ ok, data, error }` envelope and uses validated Zod schemas for both input and output.

### Register with Claude Code

```bash
smithers mcp add
```

`smithers mcp add` writes the server entry to the appropriate MCP config file for the detected agent. Pass `--agent` to target a specific client, `--no-global` to install project-locally, or `--command` to override the launch command:

```bash
smithers mcp add --agent claude-code
smithers mcp add --no-global
smithers mcp add --command "pnpm smithers --mcp"
```

### Register manually

For clients that read a JSON config directly, add an entry like this:

```json
{
  "mcpServers": {
    "smithers": {
      "command": "smithers",
      "args": ["--mcp"]
    }
  }
}
```

For project-scoped installs (e.g. a monorepo where Smithers is a dev dependency):

```json
{
  "mcpServers": {
    "smithers": {
      "command": "pnpm",
      "args": ["smithers", "--mcp"]
    }
  }
}
```

---

## Tool Registration

When the server starts it calls `registerSemanticTools`, which loops over the tool definitions produced by `createSemanticToolDefinitions` and registers each one via `server.registerTool`. Every tool carries:

- **`inputSchema`** — a Zod object schema describing accepted parameters.
- **`outputSchema`** — a Zod schema for the structured response envelope.
- **`annotations`** — MCP annotation metadata (`readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`).

### Structured tool envelope

Every tool returns the same top-level shape:

```ts
{
  ok: boolean;
  data?: { ... };     // present on success
  error?: {           // present on failure
    code: string;
    message: string;
    details?: Record<string, unknown> | null;
    docsUrl?: string | null;
  };
}
```

The response is also echoed as a `text` content block so clients that do not parse `structuredContent` still receive the JSON payload.
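
On the client side, the envelope lends itself to a single unwrapping helper. The following is a minimal sketch, assuming you already have the parsed envelope in hand; `ToolEnvelope`, `SmithersToolError`, `unwrap`, and the `"UNKNOWN"` fallback code are hypothetical names, not exports of `smithers-orchestrator`.

```ts
interface ToolError {
  code: string;
  message: string;
  details?: Record<string, unknown> | null;
  docsUrl?: string | null;
}

interface ToolEnvelope<T> {
  ok: boolean;
  data?: T;
  error?: ToolError;
}

// Error subclass that keeps the structured code and details available.
class SmithersToolError extends Error {
  readonly code: string;
  readonly details: Record<string, unknown> | null;

  constructor(error: ToolError) {
    super(`${error.code}: ${error.message}`);
    this.name = "SmithersToolError";
    this.code = error.code;
    this.details = error.details ?? null;
  }
}

// Return `data` on success; throw a typed error otherwise.
function unwrap<T>(envelope: ToolEnvelope<T>): T {
  if (envelope.ok && envelope.data !== undefined) return envelope.data;
  // "UNKNOWN" is a placeholder code for malformed envelopes (assumption).
  throw new SmithersToolError(
    envelope.error ?? { code: "UNKNOWN", message: "Envelope had no error field" },
  );
}
```

Callers can then branch on `error.code` in a `catch` block instead of checking `ok` after every tool call.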

### Tool annotations

| Annotation | Tools | Meaning |
|---|---|---|
| `readOnlyHint: true` | Most query tools | Tool does not modify state |
| `readOnlyHint: false, openWorldHint: true` | `run_workflow` | Launches external processes |
| `readOnlyHint: false, destructiveHint: true, idempotentHint: false` | `resolve_approval`, `revert_attempt` | Mutates persisted state irreversibly |

---

## Tool Reference

### list_workflows

List all Smithers workflows discovered in the working directory.

**Input:** none

**Output:**

```ts
{
  workflows: Array<{
    id: string;
    displayName: string;
    entryFile: string;
    sourceType: "seeded" | "user" | "generated";
  }>;
}
```

Use the returned `id` values as the `workflowId` parameter for `run_workflow`.

---

### run_workflow

Start or resume a discovered workflow.

**Input:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `workflowId` | `string` | required | Workflow ID from `list_workflows` |
| `input` | `Record<string, unknown>` | `{}` | Workflow input object |
| `prompt` | `string` | — | Shorthand: sets `input.prompt` when `input` is not provided |
| `runId` | `string` | auto | Custom run ID |
| `resume` | `boolean` | `false` | Resume an existing run; requires `runId` |
| `force` | `boolean` | `false` | Force-start even if a run with this ID already exists |
| `waitForTerminal` | `boolean` | `false` | Block until the run reaches a terminal state |
| `waitForStartMs` | `number` | `1000` | For background launches, how long to wait for the run row to appear in the database |
| `maxConcurrency` | `number` | — | Max concurrent nodes |
| `rootDir` | `string` | — | Root directory for tool sandboxing and path resolution |
| `logDir` | `string` | — | Directory for log files |
| `allowNetwork` | `boolean` | `false` | Allow network access in `bash` tool |
| `maxOutputBytes` | `number` | — | Cap on node output size |
| `toolTimeoutMs` | `number` | — | Per-tool call timeout |
| `hot` | `boolean` | `false` | Enable hot-reloading of the workflow file |

**Output:**

```ts
{
  workflow: { id, displayName, entryFile, sourceType };
  runId: string;
  launchMode: "background" | "waited";
  requestedResume: boolean;
  status: string;
  observedRun: RunSummary | null;
  result: { runId, status, output?, error? } | null;
}
```

**Background vs. waited launch**

By default (`waitForTerminal: false`) the tool fires the workflow and returns immediately with `launchMode: "background"`. The `observedRun` field reflects the run state polled during `waitForStartMs`. Use `watch_run` to track progress after launch.

Set `waitForTerminal: true` to block until the workflow finishes. The `result` field is populated and `launchMode` is `"waited"`.

**Run option forwarding**

`rootDir`, `logDir`, `allowNetwork`, `maxOutputBytes`, `toolTimeoutMs`, and `hot` are forwarded verbatim to the engine's `runWorkflow` call. They override any values baked into the workflow file.

---

### list_runs

List recent runs with summary data.

**Input:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `limit` | `number` (1–200) | `20` | Max runs to return |
| `status` | `string` | — | Filter by status (`running`, `finished`, `failed`, etc.) |

**Output:**

```ts
{
  runs: RunSummary[];
}
```

`RunSummary` fields: `runId`, `workflowName`, `workflowPath`, `parentRunId`, `status`, `createdAtMs`, `startedAtMs`, `finishedAtMs`, `heartbeatAtMs`, `activeNodeId`, `activeNodeLabel`, `pendingApprovalCount`, `waitingTimers`, `countsByState`.

---

### get_run

Get the full detail record for a specific run, including steps, approvals, timers, loop state, lineage, config, and error.

**Input:**

| Parameter | Type | Description |
|---|---|---|
| `runId` | `string` | Run ID |

**Output:**

```ts
{
  run: RunSummary & {
    steps: Array<{ nodeId, iteration, state, lastAttempt, updatedAtMs, outputTable, label }>;
    approvals: PendingApproval[];
    loops: Array<{ loopId, iteration, maxIterations }>;
    continuedFromRunIds: string[];
    activeDescendantRunId: string | null;
    config: unknown | null;
    error: unknown | null;
  };
}
```

---

### watch_run

Poll a run at a fixed interval until it reaches a terminal state or a timeout expires.

**Input:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `runId` | `string` | required | Run to watch |
| `intervalMs` | `number` | `1000` | Poll interval (minimum enforced by runtime) |
| `timeoutMs` | `number` | `30000` | Wall-clock budget before giving up |

**Output:**

```ts
{
  runId: string;
  intervalMs: number;
  pollCount: number;
  reachedTerminal: boolean;
  timedOut: boolean;
  finalRun: RunSummary;
  snapshots: Array<{ observedAtMs: number; run: RunSummary }>;
}
```

When `timedOut` is `true` the run is still active; call `watch_run` again or increase `timeoutMs`. Terminal statuses are `finished`, `failed`, and `cancelled`.
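
The "call `watch_run` again" advice can be wrapped in a bounded loop. This sketch assumes a hypothetical `callTool` function supplied by your MCP client that invokes a tool and resolves with the `data` part of its envelope; `watchUntilTerminal` and `maxCalls` are illustrative names.

```ts
// Minimal subset of the documented `watch_run` output used here.
interface WatchResult {
  reachedTerminal: boolean;
  timedOut: boolean;
  finalRun: { status: string };
}

// Hypothetical transport provided by your MCP client.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<WatchResult>;

// Re-issue `watch_run` until the run is terminal or the call budget runs out.
async function watchUntilTerminal(
  callTool: CallTool,
  runId: string,
  maxCalls = 10,
  timeoutMs = 30_000,
): Promise<WatchResult> {
  let last: WatchResult | undefined;
  for (let call = 0; call < maxCalls; call++) {
    last = await callTool("watch_run", { runId, timeoutMs });
    if (last.reachedTerminal) return last;
  }
  if (last === undefined) throw new Error("maxCalls must be at least 1");
  // Budget exhausted: return the last observation so the caller can decide.
  return last;
}
```

Bounding the number of calls keeps an agent from watching a run that is blocked on an approval forever; in that situation `explain_run` is the better next step.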

---

### explain_run

Return a structured diagnosis explaining why a run is blocked, waiting, or stale.

**Input:**

| Parameter | Type | Description |
|---|---|---|
| `runId` | `string` | Run ID |

**Output:**

```ts
{
  diagnosis: {
    runId: string;
    status: string;
    summary: string;
    generatedAtMs: number;
    blockers: Array<{
      kind: string;
      nodeId: string;
      iteration: number | null;
      reason: string;
      waitingSince: number;
      unblocker: string;
      context?: string;
      signalName?: string | null;
      dependencyNodeId?: string | null;
      firesAtMs?: number | null;
      remainingMs?: number | null;
      attempt?: number | null;
      maxAttempts?: number | null;
    }>;
    currentNodeId: string | null;
  };
}
```

The `summary` field is a human-readable sentence. `blockers` lists every node currently preventing progress, with `unblocker` describing what action or event would unblock it.

---

### list_pending_approvals

List approvals that are waiting for a human decision, optionally filtered by run, workflow, or node.

**Input:**

| Parameter | Type | Description |
|---|---|---|
| `runId` | `string` | Filter by run ID |
| `workflowName` | `string` | Filter by workflow name |
| `nodeId` | `string` | Filter by node ID |

All parameters are optional. Omit all to list every pending approval across all runs.

**Output:**

```ts
{
  approvals: Array<{
    runId: string;
    nodeId: string;
    iteration: number;
    status: string;
    requestedAtMs: number | null;
    decidedAtMs: number | null;
    note: string | null;
    decidedBy: string | null;
    request: unknown;
    decision: unknown;
    autoApproved?: boolean;
    workflowName: string | null;
    runStatus: string | null;
    nodeLabel: string | null;
  }>;
}
```

---

### resolve_approval

Approve or deny a pending approval. This tool is destructive and non-idempotent.

**Input:**

| Parameter | Type | Description |
|---|---|---|
| `action` | `"approve" \| "deny"` | required — decision to record |
| `runId` | `string` | Filter to a specific run |
| `workflowName` | `string` | Filter by workflow name |
| `nodeId` | `string` | Filter by node ID |
| `iteration` | `number` | Filter by loop iteration |
| `note` | `string` | Optional note to record with the decision |
| `decidedBy` | `string` | Identity of the decision-maker |
| `decision` | `unknown` | Structured decision payload passed back to the workflow |

**Ambiguity guard**

If the filters match zero approvals the tool errors with `INVALID_INPUT`. If the filters match more than one approval the tool errors with `INVALID_INPUT` and returns the list of matches in `details.matches` — add `runId`, `nodeId`, or `iteration` to narrow the selection. The tool never guesses when multiple approvals match.
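
The guard amounts to "exactly one match or an error". The sketch below illustrates that rule over an in-memory list; it is not the tool's implementation, and `ApprovalRef`, `selectApproval`, and the message strings are hypothetical.

```ts
interface ApprovalRef {
  runId: string;
  nodeId: string;
  iteration: number;
  workflowName: string | null;
}

type ApprovalFilter = Partial<ApprovalRef>;

type Selection =
  | { ok: true; approval: ApprovalRef }
  | { ok: false; code: "INVALID_INPUT"; message: string; matches: ApprovalRef[] };

// Pick exactly one pending approval, or explain why that was not possible.
function selectApproval(pending: ApprovalRef[], filter: ApprovalFilter): Selection {
  const matches = pending.filter(
    (a) =>
      (filter.runId === undefined || a.runId === filter.runId) &&
      (filter.workflowName === undefined || a.workflowName === filter.workflowName) &&
      (filter.nodeId === undefined || a.nodeId === filter.nodeId) &&
      (filter.iteration === undefined || a.iteration === filter.iteration),
  );
  const [only] = matches;
  if (matches.length === 1 && only !== undefined) return { ok: true, approval: only };
  const message =
    matches.length === 0
      ? "No pending approval matches the given filters"
      : `${matches.length} approvals match; add runId, nodeId, or iteration`;
  // Mirrors the documented behavior: both cases report INVALID_INPUT.
  return { ok: false, code: "INVALID_INPUT", message, matches };
}
```

Returning the candidate list on ambiguity lets the caller retry with a narrower filter without a second listing call.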

**Output:**

```ts
{
  action: "approve" | "deny";
  approval: PendingApproval;   // with updated status, decidedAtMs, note, decidedBy
  run: RunSummary | null;
}
```

---

### get_node_detail

Get enriched detail for a single node, including all attempts, tool calls, token usage, scorer results, and validated output.

**Input:**

| Parameter | Type | Description |
|---|---|---|
| `runId` | `string` | required |
| `nodeId` | `string` | required |
| `iteration` | `number` | Loop iteration (default: latest) |

**Output:**

```ts
{
  detail: {
    node: { runId, nodeId, iteration, state, lastAttempt, updatedAtMs, outputTable, label };
    status: string;
    durationMs: number | null;
    attemptsSummary: { total, failed, cancelled, succeeded, waiting };
    attempts: unknown[];
    toolCalls: unknown[];
    tokenUsage: unknown;
    scorers: unknown[];
    output: {
      validated: unknown | null;
      raw: unknown | null;
      source: "cache" | "output-table" | "none";
      cacheKey: string | null;
    };
    limits: {
      toolPayloadBytesHuman: number;
      validatedOutputBytesHuman: number;
    };
  };
}
```

---

### revert_attempt

Revert the workspace and frame history back to the state captured at a specific attempt. This is destructive and non-idempotent.

**Input:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `runId` | `string` | required | Run containing the node |
| `nodeId` | `string` | required | Node to revert |
| `iteration` | `number` | `0` | Loop iteration |
| `attempt` | `number` | required | Attempt number to revert to (must be ≥ 1) |

**Output:**

```ts
{
  runId: string;
  nodeId: string;
  iteration: number;
  attempt: number;
  success: boolean;
  error?: string;
  jjPointer?: string;
  run: RunSummary | null;
}
```

---

### list_artifacts

List structured output artifacts produced by nodes in a run.

**Input:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `runId` | `string` | required | Run ID |
| `nodeId` | `string` | — | Limit to a specific node |
| `includeRaw` | `boolean` | `false` | Include raw (pre-validation) output values |

**Output:**

```ts
{
  artifacts: Array<{
    artifactId: string;   // "<runId>:<nodeId>:<iteration>"
    kind: "node-output";
    runId: string;
    nodeId: string;
    iteration: number;
    label: string | null;
    state: string;
    outputTable: string | null;
    source: "cache" | "output-table" | "none";
    cacheKey: string | null;
    value: unknown | null;
    rawValue?: unknown | null;   // only when includeRaw=true
  }>;
}
```

Only nodes that have an `outputTable` and a non-`none` output source are included.

---

### get_chat_transcript

Return the structured agent chat transcript for a run, grouped by attempts.

**Input:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `runId` | `string` | required | Run ID |
| `all` | `boolean` | `false` | Include all attempts, not just those with known output events |
| `includeStderr` | `boolean` | `true` | Include stderr messages |
| `tail` | `number` | — | Return only the last N messages |

**Output:**

```ts
{
  runId: string;
  attempts: Array<{
    attemptKey: string;
    nodeId: string;
    iteration: number;
    attempt: number;
    state: string;
    startedAtMs: number;
    finishedAtMs: number | null;
    cached: boolean;
    meta: unknown | null;
  }>;
  messages: Array<{
    id: string;
    attemptKey: string;
    nodeId: string;
    iteration: number;
    attempt: number;
    role: "user" | "assistant" | "stderr";
    stream: "stdout" | "stderr" | null;
    timestampMs: number;
    text: string;
    source: "prompt" | "event" | "responseText";
  }>;
}
```

Messages are sorted by `timestampMs`. Use `tail` to limit context window usage when transcripts are long.

---

### get_run_events

Return the raw structured event history for a run with optional filtering.

**Input:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `runId` | `string` | required | Run ID |
| `afterSeq` | `number` | — | Only events with `seq` greater than this value |
| `limit` | `number` (1–10000) | `200` | Max events to return |
| `nodeId` | `string` | — | Filter to events for a specific node |
| `types` | `string[]` | — | Filter to specific event types (e.g. `["NodeFinished", "NodeFailed"]`) |
| `sinceTimestampMs` | `number` | — | Only events at or after this timestamp |

**Output:**

```ts
{
  runId: string;
  events: Array<{
    runId: string;
    seq: number;
    timestampMs: number;
    type: string;
    payload: unknown | null;
  }>;
}
```

Paginate using `afterSeq`: pass the `seq` of the last received event to fetch the next page.
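
A pagination loop over `afterSeq` might look like the sketch below. It assumes a hypothetical `fetchPage` wrapper around a `get_run_events` call that resolves with the `events` array, and it treats a page shorter than `limit` as the end of the history; both are assumptions for illustration.

```ts
interface RunEvent {
  seq: number;
  type: string;
}

// Hypothetical wrapper around a `get_run_events` tool call.
type FetchPage = (args: {
  runId: string;
  afterSeq: number | undefined;
  limit: number;
}) => Promise<RunEvent[]>;

// Collect every event for a run by walking `afterSeq` forward page by page.
async function collectEvents(
  fetchPage: FetchPage,
  runId: string,
  limit = 200,
): Promise<RunEvent[]> {
  const all: RunEvent[] = [];
  let afterSeq: number | undefined;
  for (;;) {
    const page = await fetchPage({ runId, afterSeq, limit });
    all.push(...page);
    // Assumption: a short page means there is nothing left to fetch.
    if (page.length < limit) return all;
    const lastEvent = page[page.length - 1];
    if (lastEvent === undefined) return all;
    afterSeq = lastEvent.seq;
  }
}
```

For long-running runs, prefer processing each page as it arrives over collecting the whole history in memory.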

---

## Usage Examples

### List workflows and start a run

```
> list_workflows {}

{
  "ok": true,
  "data": {
    "workflows": [
      { "id": "bugfix", "displayName": "bugfix", "entryFile": "./workflows/bugfix.tsx", "sourceType": "user" }
    ]
  }
}

> run_workflow { "workflowId": "bugfix", "prompt": "Fix the auth token expiry bug" }

{
  "ok": true,
  "data": {
    "runId": "smi_abc123",
    "launchMode": "background",
    "status": "running",
    ...
  }
}
```

### Watch until complete

```
> watch_run { "runId": "smi_abc123", "timeoutMs": 120000 }

{
  "ok": true,
  "data": {
    "reachedTerminal": true,
    "timedOut": false,
    "finalRun": { "status": "finished", ... }
  }
}
```

### Resolve a pending approval

```
> list_pending_approvals { "runId": "smi_abc123" }

{
  "ok": true,
  "data": {
    "approvals": [
      { "nodeId": "deploy", "iteration": 0, "nodeLabel": "Deploy to production", ... }
    ]
  }
}

> resolve_approval { "action": "approve", "runId": "smi_abc123", "nodeId": "deploy", "decidedBy": "alice", "note": "Looks good" }

{
  "ok": true,
  "data": {
    "action": "approve",
    "approval": { "status": "approved", "decidedAtMs": 1707500100000, ... },
    "run": { "status": "running", ... }
  }
}
```

### Debug a blocked run

```
> explain_run { "runId": "smi_abc123" }

{
  "ok": true,
  "data": {
    "diagnosis": {
      "summary": "Run is waiting for a human approval on node 'deploy'.",
      "blockers": [
        {
          "kind": "approval",
          "nodeId": "deploy",
          "reason": "Node requires human approval before proceeding.",
          "unblocker": "Call resolve_approval with action=approve or action=deny."
        }
      ]
    }
  }
}
```

### Revert a failed attempt

```
> get_node_detail { "runId": "smi_abc123", "nodeId": "analyze" }

{
  "ok": true,
  "data": {
    "detail": {
      "attemptsSummary": { "total": 3, "failed": 2, "succeeded": 1 },
      ...
    }
  }
}

> revert_attempt { "runId": "smi_abc123", "nodeId": "analyze", "attempt": 1 }

{
  "ok": true,
  "data": {
    "success": true,
    "run": { "status": "running", ... }
  }
}
```

---

## Error Codes

All errors follow the structured envelope. Common codes:

| Code | Meaning |
|---|---|
| `RUN_NOT_FOUND` | No run exists with the given ID |
| `INVALID_INPUT` | Missing required field, failed validation, or ambiguous approval filter |
| `WORKFLOW_MISSING_DEFAULT` | Workflow file has no default export |
| `WORKFLOW_NOT_FOUND` | No workflow matches the given ID |

---

## Serve Mode

> Run a single workflow as an HTTP server with Hono — interact with it over REST, stream events via SSE, and manage approvals remotely.
> Source: https://smithers.sh/integrations/serve

Serve mode starts a Hono-based HTTP server alongside a running workflow. Every route operates on the single active run — no workflow path or run ID needed in requests.

## CLI

```bash
smithers up workflow.tsx --serve --port 3000 --host 0.0.0.0
```

| Flag | Default | Description |
|---|---|---|
| `--serve` | `false` | Enable HTTP server mode |
| `--port` | `7331` | TCP port |
| `--host` | `127.0.0.1` | Bind address |
| `--auth-token` | `SMITHERS_API_KEY` env | Bearer token for auth |
| `--metrics` | `true` | Expose `/metrics` Prometheus endpoint |

The process stays alive after the workflow completes so you can still query final state. Ctrl+C stops both the server and the workflow.

Detached mode works too:

```bash
smithers up workflow.tsx --serve --port 8080 -d
```

## Programmatic

```ts
import { createServeApp } from "smithers-orchestrator/serve";

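// Assumes `workflow`, `adapter`, and `runId` are already in scope
// (see ServeOptions below for their types).
const app = createServeApp({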
  workflow,
  adapter,
  runId,
  abort: new AbortController(),
  authToken: "sk-secret",
});

Bun.serve({ port: 3000, fetch: app.fetch });
```

`createServeApp` returns a standard Hono app. Mount it with `Bun.serve`, pass it to another Hono app via `app.route()`, or use `app.fetch` directly in tests.

## ServeOptions

```ts
type ServeOptions = {
  workflow: SmithersWorkflow<any>;
  adapter: SmithersDb;
  runId: string;
  abort: AbortController;
  authToken?: string;
  metrics?: boolean;
};
```

| Option | Type | Description |
|---|---|---|
| `workflow` | `SmithersWorkflow` | Loaded workflow instance |
| `adapter` | `SmithersDb` | Database adapter for the workflow |
| `runId` | `string` | Active run ID |
| `abort` | `AbortController` | Shared abort controller for cancellation |
| `authToken` | `string` | Bearer token. Falls back to `SMITHERS_API_KEY`. Disabled if unset. |
| `metrics` | `boolean` | Expose `/metrics` endpoint. Default: `true`. |

---

## Authentication

When `authToken` is configured, every route except `/health` requires:

- `Authorization: Bearer <token>`, or
- `x-smithers-key: <token>`

Missing or invalid tokens receive `401`.
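
The rule above can be expressed as a small check. This is an illustrative sketch, not the middleware Smithers ships: `presentedToken` and `isAuthorized` are hypothetical names, and headers are modeled as a plain lowercase-keyed record.

```ts
type HeaderMap = Record<string, string | undefined>;

// Extract the presented token from either supported header.
function presentedToken(headers: HeaderMap): string | null {
  const authorization = headers["authorization"];
  if (authorization !== undefined) {
    const token = /^Bearer\s+(.+)$/i.exec(authorization)?.[1];
    if (token !== undefined) return token;
  }
  return headers["x-smithers-key"] ?? null;
}

// Decide whether a request may proceed; a `false` result maps to a 401.
function isAuthorized(path: string, headers: HeaderMap, authToken: string | undefined): boolean {
  if (authToken === undefined) return true; // auth is disabled when no token is set
  if (path === "/health") return true; // documented exemption
  // A production check should use a constant-time comparison.
  return presentedToken(headers) === authToken;
}
```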

---

## Routes

### GET /health

Always returns `200` regardless of auth.

```json
{ "ok": true }
```

### GET /

Run status and node summary.

```json
{
  "runId": "run-1234",
  "workflowName": "bugfix",
  "status": "running",
  "startedAtMs": 1707500000000,
  "finishedAtMs": null,
  "summary": { "finished": 3, "in-progress": 1, "pending": 2 }
}
```

### GET /events

SSE stream of lifecycle events. Same format as the [multi-workflow server](/integrations/server#get-v1runsrunidevents).

| Parameter | Type | Default | Description |
|---|---|---|---|
| `afterSeq` | `number` | `-1` | Only events after this sequence |

```
event: smithers
data: {"type":"NodeStarted","runId":"run-1234","nodeId":"analyze","iteration":0,"attempt":0}
id: 1

event: smithers
data: {"type":"NodeFinished","runId":"run-1234","nodeId":"analyze","iteration":0,"attempt":0}
id: 2
```

- Polls every 500ms.
- Auto-closes when the run reaches a terminal state.
- Reconnect with `?afterSeq=N` to resume from a known position.
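
If your client does not have an SSE library available, the stream format shown above is simple enough to parse by hand. The sketch below handles a fully buffered chunk of text; `parseSse` is a hypothetical helper, and it assumes every `data:` payload is JSON, as in the sample.

```ts
interface SseEvent {
  event: string;
  id: string | null;
  data: unknown;
}

// Parse buffered SSE text into events. Events are separated by a blank
// line; each line has the form `field: value`.
function parseSse(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message"; // SSE default when no `event:` field is sent
    let id: string | null = null;
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      const colon = line.indexOf(":");
      if (colon <= 0) continue; // skip blank lines and comment lines
      const field = line.slice(0, colon);
      const value = line.slice(colon + 1).replace(/^ /, "");
      if (field === "event") event = value;
      else if (field === "id") id = value;
      else if (field === "data") dataLines.push(value);
    }
    if (dataLines.length > 0) {
      // Throws if the payload is not valid JSON (assumed to be JSON here).
      events.push({ event, id, data: JSON.parse(dataLines.join("\n")) });
    }
  }
  return events;
}
```

Keeping the last parsed `id` around gives you the value to send as `afterSeq` when reconnecting.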

### GET /frames

Rendered workflow frames.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `limit` | `number` | `50` | Max frames |
| `afterFrameNo` | `number` | — | Frames after this number |

### POST /approve/:nodeId

Approve a pending approval gate.

```json
{
  "iteration": 0,
  "note": "Looks good",
  "decidedBy": "alice"
}
```

All fields optional. Returns `{ "runId": "run-1234" }`.

### POST /deny/:nodeId

Deny a pending approval gate. Same body as `/approve/:nodeId`.

### POST /cancel

Cancel the running workflow.

| Status | Code | Condition |
|---|---|---|
| 200 | — | Cancelled successfully |
| 409 | `RUN_NOT_ACTIVE` | Run already finished/failed/cancelled |

### GET /metrics

Prometheus text exposition. Same metrics as the [multi-workflow server](/integrations/server#get-metrics).

---

## Error Format

```json
{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable description"
  }
}
```

Unknown routes return `404` with code `NOT_FOUND`.

---

## Serve Mode vs Multi-Workflow Server

| | Serve mode | [Multi-workflow server](/integrations/server) |
|---|---|---|
| Scope | Single workflow, single run | Any workflow, multiple concurrent runs |
| Start | `smithers up --serve` or `createServeApp()` | `startServer()` |
| Routes | `/`, `/events`, `/approve/:nodeId`, ... | `/v1/runs`, `/v1/runs/:runId`, ... |
| Framework | Hono | Node.js `http` |
| Use case | Development, single-purpose services | Production API gateway |

---

## Example

```bash
# Start a workflow with serve mode
smithers up workflow.tsx --serve --port 8080 --auth-token sk-secret

# Check status
curl http://localhost:8080/ -H "Authorization: Bearer sk-secret"

# Stream events
curl -N http://localhost:8080/events -H "Authorization: Bearer sk-secret"

# Approve a gate
curl -X POST http://localhost:8080/approve/deploy \
  -H "Authorization: Bearer sk-secret" \
  -H "Content-Type: application/json" \
  -d '{"note": "Ship it", "decidedBy": "alice"}'

# Health check (no auth needed)
curl http://localhost:8080/health
```

---

## Gateway

> Headless multi-workflow control plane for remote run management, approvals, signals, cron, WebSocket streaming, and HTTP RPC.
> Source: https://smithers.sh/integrations/gateway

`Gateway` is Smithers' headless control plane for remote workflow execution.

If `bunx smithers-orchestrator up` is for local runs and [`startServer()`](/integrations/server) is for loading workflow files over HTTP, `Gateway` is for long-lived remote control: bots, dashboards, webhook receivers, and schedulers connect once, authenticate, start runs, subscribe to progress, decide approvals, inject signals, and manage cron schedules.

That makes it a good fit for ClaudeBot/OpenClaw-style systems where Smithers is the orchestration engine sitting behind GitHub, Slack, CI, or an internal operations UI.

## What The Gateway Does

- Registers multiple named workflows behind one server
- Exposes a [WebSocket](https://developer.mozilla.org/en-US/docs/Web/API/WebSocket) RPC protocol with event streaming
- Exposes `POST /rpc` for stateless HTTP callers
- Enforces auth and scopes before a client can call methods
- Surfaces pending [approvals](/concepts/approvals) with rich metadata
- Delivers external [signals](/runtime/events) into waiting workflows
- Persists and triggers cron schedules
- Propagates gateway auth context into `ctx.auth`

## Import

```ts
import { Gateway } from "smithers-orchestrator";
```

## Quick Start

This example registers a [workflow](/components/workflow), exposes [approvals](/concepts/approvals) remotely, and starts the gateway on port `7331`.

```tsx
/** @jsxImportSource smithers-orchestrator */
import {
  Approval,
  Gateway,
  Sequence,
  Task,
  Workflow,
  approvalDecisionSchema,
  createSmithers,
} from "smithers-orchestrator";
import { z } from "zod";

const { smithers, outputs } = createSmithers({
  plan: z.object({
    summary: z.string(),
  }),
  approval: approvalDecisionSchema,
  deploy: z.object({
    shipped: z.boolean(),
    env: z.string(),
  }),
});

const deployWorkflow = smithers((ctx) => (
  <Workflow name="deploy">
    <Sequence>
      <Task id="plan" output={outputs.plan}>
        {{
          summary: `Deploy ${ctx.input.sha} to ${ctx.input.env}`,
        }}
      </Task>

      <Approval
        id="ship"
        output={outputs.approval}
        request={{
          title: `Deploy ${ctx.input.sha}?`,
          summary: "Remote operator approval required.",
        }}
        allowedScopes={["approve"]}
      />

      <Task id="deploy" output={outputs.deploy}>
        {{
          shipped: true,
          env: String(ctx.input.env),
        }}
      </Task>
    </Sequence>
  </Workflow>
));

const smithersGateway = new Gateway({
  heartbeatMs: 15_000,
  auth: {
    mode: "token",
    tokens: {
      "operator-token": {
        role: "operator",
        scopes: ["*"],
        userId: "user:ops",
      },
      "viewer-token": {
        role: "viewer",
        scopes: ["health", "runs.list", "runs.get", "approvals.list"],
        userId: "user:viewer",
      },
    },
  },
  defaults: {
    cliAgentTools: "explicit-only",
  },
});

smithersGateway.register("deploy", deployWorkflow, {
  schedule: "0 8 * * 1-5",
});

await smithersGateway.listen({ port: 7331 });
```

### What `GatewayOptions` Configures

```ts
type GatewayOptions = {
  protocol?: number;
  features?: string[];
  heartbeatMs?: number;
  auth?: GatewayAuthConfig;
  defaults?: {
    cliAgentTools?: "all" | "explicit-only";
  };
  maxBodyBytes?: number;
};
```

| Option | Default | Meaning |
| --- | --- | --- |
| `protocol` | `1` | Gateway protocol version negotiated during `connect` |
| `features` | `["streaming", "runs"]` | Feature list returned to clients in `hello` |
| `heartbeatMs` | `15000` | Interval for `tick` events and scheduler polling cap |
| `auth` | `undefined` | Auth mode and scope mapping |
| `defaults.cliAgentTools` | `undefined` | Default tool policy for [CLI agents](/integrations/cli-agents) started through the gateway |
| `maxBodyBytes` | `1048576` | Max JSON body size for `POST /rpc` |

### `ctx.auth` Inside Workflows

Runs started through the gateway receive auth metadata in `ctx.auth`:

```ts
{
  triggeredBy: string;
  role: string;
  scopes: string[];
  createdAt: string;
}
```

That lets you build [workflows](/concepts/workflows-overview) that behave differently for operators, bots, cron, or external services:

```tsx
<Task id="audit" output={outputs.audit}>
  {{
    triggeredBy: ctx.auth?.triggeredBy ?? "unknown",
    role: ctx.auth?.role ?? "unknown",
    scopes: ctx.auth?.scopes ?? [],
  }}
</Task>
```

## Authentication Modes

The gateway supports three auth modes. WebSocket clients authenticate during `connect`. HTTP clients authenticate with headers on each `POST /rpc`.

### Token Mode

Static tokens map directly to a role, scopes, and an optional user ID.

```ts
new Gateway({
  auth: {
    mode: "token",
    tokens: {
      "viewer-token": {
        role: "viewer",
        scopes: ["health", "runs.list", "runs.get", "approvals.list"],
      },
      "approver-token": {
        role: "approver",
        scopes: ["approve", "approvals.list", "runs.get"],
        userId: "user:oncall",
      },
      "operator-token": {
        role: "operator",
        scopes: ["*"],
        userId: "user:ops",
      },
    },
  },
});
```

Use token mode for internal services, quick prototypes, or webhook relays where static secrets are acceptable.

### JWT Mode

JWT mode verifies `HS256` tokens and extracts scopes, role, and user ID from claims.

```ts
new Gateway({
  auth: {
    mode: "jwt",
    issuer: "https://auth.example.com",
    audience: "smithers",
    secret: process.env.GATEWAY_JWT_SECRET!,
    scopesClaim: "permissions",
    roleClaim: "role",
    userClaim: "sub",
    defaultRole: "operator",
    defaultScopes: ["runs.get"],
    clockSkewSeconds: 60,
  },
});
```

JWT validation currently checks:

- `alg === "HS256"`
- HMAC signature
- `iss`
- `aud`
- `exp`
- `nbf`

If the configured scope claim is a string, it can be space- or comma-separated. Arrays work too.
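
The accepted claim formats can be normalized in a few lines. This is a sketch of the described behavior rather than the gateway's own code; `normalizeScopes` is a hypothetical name, and falling back to `defaultScopes` for a missing claim is an assumption.

```ts
// Normalize a scopes claim that may be an array or a space- or
// comma-separated string.
function normalizeScopes(claim: unknown, defaultScopes: string[] = []): string[] {
  if (Array.isArray(claim)) {
    return claim.filter((s): s is string => typeof s === "string" && s.length > 0);
  }
  if (typeof claim === "string") {
    return claim.split(/[\s,]+/).filter((s) => s.length > 0);
  }
  // Assumption: a missing or unusable claim falls back to the defaults.
  return defaultScopes;
}
```

For example, `"runs.get runs.list,approve"` and `["runs.get", "runs.list", "approve"]` both yield the same three scopes.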

### Trusted Proxy Mode

Trusted-proxy mode assumes something in front of the gateway already authenticated the user and injected identity headers.

```ts
new Gateway({
  auth: {
    mode: "trusted-proxy",
    allowedOrigins: ["https://ops.example.com"],
    trustedHeaders: ["x-user-id", "x-user-scopes", "x-user-role"],
    defaultRole: "operator",
    defaultScopes: ["runs.list", "runs.get"],
  },
});
```

If `trustedHeaders` is omitted, the gateway defaults to:

- `x-user-id`
- `x-user-scopes`
- `x-user-role`

Use this only when the gateway is behind something you control, such as Cloudflare Access, an internal API gateway, or an auth proxy that strips and rewrites those headers.

## WebSocket Protocol

The WebSocket endpoint is served at the server root:

```txt
ws://localhost:7331
```

The protocol is a JSON-RPC-style request/response scheme with server-pushed events, not SSE or REST. There are three frame types.

### Request Frames

```ts
type RequestFrame = {
  type: "req";
  id: string;
  method: string;
  params?: unknown;
};
```

### Response Frames

```ts
type ResponseFrame = {
  type: "res";
  id: string;
  ok: boolean;
  payload?: unknown;
  error?: {
    code: string;
    message: string;
  };
};
```

### Event Frames

```ts
type EventFrame = {
  type: "event";
  event: string;
  payload?: unknown;
  seq: number;
  stateVersion: number;
};
```

`seq` is per connection. `stateVersion` increments globally each time the gateway broadcasts a new event.
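
Because every frame carries a `type` discriminant, a client can route incoming messages with an exhaustive `switch`. The sketch below restates the three frame types as a union; `Frame` and `describeFrame` are hypothetical names, and the function trusts the JSON shape rather than validating it.

```ts
type RequestFrame = { type: "req"; id: string; method: string; params?: unknown };
type ResponseFrame = {
  type: "res";
  id: string;
  ok: boolean;
  payload?: unknown;
  error?: { code: string; message: string };
};
type EventFrame = {
  type: "event";
  event: string;
  payload?: unknown;
  seq: number;
  stateVersion: number;
};
type Frame = RequestFrame | ResponseFrame | EventFrame;

// Summarize a raw WebSocket message; a real client would dispatch here.
function describeFrame(raw: string): string {
  const frame = JSON.parse(raw) as Frame; // no runtime validation in this sketch
  switch (frame.type) {
    case "req":
      return `request ${frame.id} -> ${frame.method}`;
    case "res":
      return frame.ok
        ? `response ${frame.id} ok`
        : `response ${frame.id} failed: ${frame.error?.code ?? "unknown"}`;
    case "event":
      return `event ${frame.event} #${frame.seq} (state ${frame.stateVersion})`;
    default:
      throw new Error("Unknown frame type");
  }
}
```

Matching `res` frames to pending requests by `id` is what lets a single connection carry many concurrent calls.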

### Handshake

When a client connects, the server immediately sends a challenge event:

```json
{
  "type": "event",
  "event": "connect.challenge",
  "payload": {
    "nonce": "8d6d8e1a-...",
    "ts": 1765158412000
  },
  "seq": 1,
  "stateVersion": 0
}
```

The client then sends `connect`:

```json
{
  "type": "req",
  "id": "connect-1",
  "method": "connect",
  "params": {
    "minProtocol": 1,
    "maxProtocol": 1,
    "client": {
      "id": "github-bot",
      "version": "1.0.0",
      "platform": "node"
    },
    "auth": {
      "token": "operator-token"
    },
    "subscribe": ["run_123"]
  }
}
```

The gateway replies with a `hello` payload:

```json
{
  "type": "res",
  "id": "connect-1",
  "ok": true,
  "payload": {
    "protocol": 1,
    "features": ["streaming", "runs"],
    "policy": {
      "heartbeatMs": 15000
    },
    "auth": {
      "sessionToken": "e9a8b9d5-...",
      "role": "operator",
      "scopes": ["*"],
      "userId": "user:ops"
    },
    "snapshot": {
      "runs": [],
      "approvals": [],
      "stateVersion": 0
    }
  }
}
```
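
The handshake above can be modeled as a small pure function that reacts to each server frame. This is a sketch under assumptions: `handshakeStep`, the `"UNKNOWN"` fallback code, and the client metadata are placeholders, and the nonce is not echoed back because the `connect` example above does not include it.

```ts
type ServerFrame =
  | { type: "event"; event: string; payload?: unknown; seq: number; stateVersion: number }
  | { type: "res"; id: string; ok: boolean; payload?: unknown; error?: { code: string; message: string } };

type ConnectRequest = {
  type: "req";
  id: string;
  method: "connect";
  params: {
    minProtocol: number;
    maxProtocol: number;
    client: { id: string; version: string; platform: string };
    auth: { token: string };
    subscribe?: string[];
  };
};

type HandshakeStep =
  | { kind: "send"; frame: ConnectRequest }
  | { kind: "ready"; sessionToken: string; heartbeatMs: number }
  | { kind: "failed"; code: string; message: string }
  | { kind: "ignore" };

const CONNECT_ID = "connect-1";

function handshakeStep(frame: ServerFrame, token: string, subscribe?: string[]): HandshakeStep {
  if (frame.type === "event") {
    // Only the challenge event triggers a `connect` request.
    if (frame.event !== "connect.challenge") return { kind: "ignore" };
    const params: ConnectRequest["params"] = {
      minProtocol: 1,
      maxProtocol: 1,
      client: { id: "example-client", version: "0.0.0", platform: "node" }, // placeholder values
      auth: { token },
    };
    if (subscribe) params.subscribe = subscribe;
    return { kind: "send", frame: { type: "req", id: CONNECT_ID, method: "connect", params } };
  }
  if (frame.id !== CONNECT_ID) return { kind: "ignore" };
  if (!frame.ok) {
    return {
      kind: "failed",
      code: frame.error?.code ?? "UNKNOWN", // hypothetical fallback code
      message: frame.error?.message ?? "connect failed",
    };
  }
  // Read only the fields of the `hello` payload that this sketch needs.
  const hello = frame.payload as { policy: { heartbeatMs: number }; auth: { sessionToken: string } };
  return { kind: "ready", sessionToken: hello.auth.sessionToken, heartbeatMs: hello.policy.heartbeatMs };
}
```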

### Subscriptions

The optional `subscribe` array filters event delivery by `runId`.

- Omit it to receive events for every run
- Pass one or more run IDs to restrict delivery
- `runs.create`, `approvals.decide`, `signals.send`, and `cron.trigger` automatically add the affected run to the current connection's subscription set

That pattern is useful for bots that only care about the runs they started.
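
The filtering rules can be pictured as a per-connection set of run IDs. The class below is an illustrative model, not gateway code; in particular, delivering events that carry no `runId` (such as `tick`) to every connection is an assumption.

```ts
type GatewayEvent = { event: string; payload?: { runId?: string; [key: string]: unknown } };

class SubscriptionSet {
  // null means "no filter": events for every run are delivered.
  private runIds: Set<string> | null;

  constructor(subscribe?: string[]) {
    this.runIds = subscribe ? new Set(subscribe) : null;
  }

  // Called when a method such as runs.create affects a run on this connection.
  add(runId: string): void {
    if (this.runIds !== null) this.runIds.add(runId);
  }

  shouldDeliver(e: GatewayEvent): boolean {
    const runId = e.payload?.runId;
    if (runId === undefined) return true; // assumption: connection-level events always pass
    return this.runIds === null || this.runIds.has(runId);
  }
}
```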

### Heartbeats

After a successful `connect`, the gateway emits `tick` events every `heartbeatMs`:

```json
{
  "type": "event",
  "event": "tick",
  "payload": {
    "ts": 1765158415000
  },
  "seq": 9,
  "stateVersion": 14
}
```

### Streamed Event Names

The gateway maps Smithers [runtime events](/runtime/events) into a stable external stream:

| Event | Payload highlights |
| --- | --- |
| `connect.challenge` | `nonce`, `ts` |
| `tick` | `ts` |
| `node.started` | `runId`, `nodeId`, `state: "in-progress"` |
| `node.finished` | `runId`, `nodeId`, `state: "finished"` |
| `node.failed` | `runId`, `nodeId`, `state: "failed"`, `error` |
| `task.output` | `runId`, `nodeId`, `output`, `stream` |
| `task.heartbeat` | `runId`, `nodeId`, `iteration`, `attempt` |
| `approval.requested` | `runId`, `nodeId`, `iteration` |
| `approval.decided` | `runId`, `nodeId`, `iteration`, `approved` |
| `approval.auto_approved` | `runId`, `nodeId`, `iteration` |
| `run.completed` | `runId`, `status`, optional `error` |
| `cron.triggered` | `cronId`, `workflow`, `runId` |
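
A dashboard or bot usually folds this stream into a per-run view. The reducer below is a minimal sketch that uses only the payload fields listed in the table; `RunView` and `applyEvent` are hypothetical names, and the set of possible `status` values is not assumed.

```ts
type NodeState = "in-progress" | "finished" | "failed";
type StreamEvent = { event: string; payload?: Record<string, unknown> };
type RunView = { nodes: Record<string, NodeState>; status: string | null };

function isNodeState(value: unknown): value is NodeState {
  return value === "in-progress" || value === "finished" || value === "failed";
}

// Mutates `runs` in place so it can be called once per incoming event frame.
function applyEvent(runs: Map<string, RunView>, e: StreamEvent): void {
  const runId = e.payload?.["runId"];
  if (typeof runId !== "string") return; // tick, connect.challenge, etc.
  let run = runs.get(runId);
  if (!run) {
    run = { nodes: {}, status: null };
    runs.set(runId, run);
  }
  switch (e.event) {
    case "node.started":
    case "node.finished":
    case "node.failed": {
      const nodeId = e.payload?.["nodeId"];
      const state = e.payload?.["state"];
      if (typeof nodeId === "string" && isNodeState(state)) run.nodes[nodeId] = state;
      break;
    }
    case "run.completed": {
      const status = e.payload?.["status"];
      if (typeof status === "string") run.status = status;
      break;
    }
  }
}
```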

## HTTP Fallback: `POST /rpc`

Webhooks, GitHub Actions, Cloud Functions, and other stateless callers usually do not want to keep a WebSocket open. For those cases the gateway exposes an HTTP RPC endpoint:

```txt
POST /rpc
```

### Auth Headers

In token or JWT mode, HTTP callers authenticate with either:

- `Authorization: Bearer <token>`
- `x-smithers-key: <token>`

In trusted-proxy mode, the gateway reads the forwarded identity headers from the request instead.

### Request Shape

`POST /rpc` accepts the same logical request as the WebSocket protocol:

```json
{
  "id": "create-1",
  "method": "runs.create",
  "params": {
    "workflow": "deploy",
    "input": {
      "env": "staging",
      "sha": "abc123"
    }
  }
}
```

### Response Shape

Successful method calls return the same `ResponseFrame` envelope:

```json
{
  "type": "res",
  "id": "create-1",
  "ok": true,
  "payload": {
    "runId": "run_abc123",
    "workflow": "deploy"
  }
}
```

Errors return the same `ResponseFrame` body with an HTTP status code:

```json
{
  "type": "res",
  "id": "create-1",
  "ok": false,
  "error": {
    "code": "FORBIDDEN",
    "message": "Missing scope for runs.create"
  }
}
```

Use WebSockets when you need live events. Use `POST /rpc` when you only need request/response semantics.

### Example

```bash
curl -X POST http://localhost:7331/rpc \
  -H "Authorization: Bearer operator-token" \
  -H "Content-Type: application/json" \
  -d '{
    "id": "create-1",
    "method": "runs.create",
    "params": {
      "workflow": "deploy",
      "input": { "env": "staging", "sha": "abc123" }
    }
  }'
```
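
The same call can be wrapped in a small TypeScript helper. This is a sketch rather than an official client: `callRpc`, `RpcError`, and the `FetchLike` type are hypothetical, and the fetch implementation is passed in so the helper can be exercised without a running gateway.

```ts
type RpcResponse =
  | { type: "res"; id: string; ok: true; payload?: unknown }
  | { type: "res"; id: string; ok: false; error: { code: string; message: string } };

// Minimal structural type; the global `fetch` in Node 18+ or Bun should satisfy it.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ status: number; json(): Promise<unknown> }>;

class RpcError extends Error {
  code: string;
  httpStatus: number;
  constructor(code: string, message: string, httpStatus: number) {
    super(message);
    this.name = "RpcError";
    this.code = code;
    this.httpStatus = httpStatus;
  }
}

// Turn a ResponseFrame into a payload, or throw with the gateway's error code.
function unwrapRpcResponse(body: RpcResponse, httpStatus: number): unknown {
  if (!body.ok) throw new RpcError(body.error.code, body.error.message, httpStatus);
  return body.payload;
}

async function callRpc(
  fetchImpl: FetchLike,
  baseUrl: string,
  token: string,
  method: string,
  params?: unknown,
): Promise<unknown> {
  const id = `${method}-${Date.now()}`;
  const res = await fetchImpl(`${baseUrl}/rpc`, {
    method: "POST",
    headers: { "content-type": "application/json", authorization: `Bearer ${token}` },
    body: JSON.stringify({ id, method, params }),
  });
  return unwrapRpcResponse((await res.json()) as RpcResponse, res.status);
}
```

A caller would then write something like `await callRpc(fetch, "http://localhost:7331", token, "runs.create", { workflow: "deploy" })` and catch `RpcError` to inspect `code`.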

## RPC Methods Reference

The gateway authorizes methods by scope. Each method falls into one of four access levels:

- `read`
- `execute`
- `approve`
- `admin`

Higher levels imply lower ones. For example, a token with `approve` can call `read` and `execute` methods too.

### System

| Method | Access | Params | Returns |
| --- | --- | --- | --- |
| `health` | `read` | none | `protocol`, `features`, `stateVersion`, `uptimeMs` |

### Runs

| Method | Access | Params | Returns |
| --- | --- | --- | --- |
| `runs.list` | `read` | `limit?`, `status?` | Recent runs across registered workflows |
| `runs.create` | `execute` | `workflow`, `input?`, `runId?` | `{ runId, workflow }` |
| `runs.get` | `read` | `runId` | Run row plus node-state summary |
| `runs.diff` | `read` | `leftRunId`, `rightRunId` | Snapshot diff between two runs |
| `runs.cancel` | `execute` | `runId` | `{ runId, status: "cancelling" }` |
| `runs.rerun` | `execute` | `runId`, `newRunId?` | Starts a new run with the original input |

Notes:

- `runs.create` requires a registered workflow key, not a file path.
- `runs.get` returns `workflowKey` so clients know which registered workflow produced the run.
- `runs.diff` compares the latest saved snapshot for each run. If either run has no snapshot yet, the method returns `NOT_FOUND`.
- `runs.rerun` reloads the original input from the workflow's `input` table, then delegates to `runs.create`.

### Frames And Attempts

| Method | Access | Params | Returns |
| --- | --- | --- | --- |
| `frames.list` | `read` | `runId`, `limit?`, `afterFrameNo?` | Render frames |
| `frames.get` | `read` | `runId`, `frameNo?` | One frame, or the latest frame when `frameNo` is omitted |
| `attempts.list` | `read` | `runId`, `nodeId?`, `iteration?` | Attempts for a run or for one node iteration |
| `attempts.get` | `read` | `runId`, `nodeId`, `iteration`, `attempt` | One attempt row |

These methods are what you use to build a run inspector or debugger.

### Approvals

| Method | Access | Params | Returns |
| --- | --- | --- | --- |
| `approvals.list` | `read` | none | Pending approvals across all workflows |
| `approvals.decide` | `approve` | `runId`, `nodeId`, `approved`, `iteration?`, `note?`, `decision?` | `{ runId, nodeId, iteration, approved }` |

`approvals.list` returns richer metadata than just `runId` and `nodeId`. Each row includes:

- `requestTitle`
- `requestSummary`
- `approvalMode`
- `options`
- `allowedScopes`
- `allowedUsers`
- `autoApprove`

`approvals.decide` enforces gateway-level approval restrictions before it records a decision:

- If the approval specifies `allowedUsers`, the caller's `userId` must match
- If the approval specifies `allowedScopes`, the caller must have one of those scopes

For selection and ranking approvals, `decision` must match the requested mode:

```json
{
  "method": "approvals.decide",
  "params": {
    "runId": "run_123",
    "nodeId": "pick-plan",
    "iteration": 0,
    "approved": true,
    "decision": {
      "selected": "balanced",
      "notes": "best fit"
    }
  }
}
```

For ranking approvals:

```json
{
  "decision": {
    "ranked": ["canary", "regional", "global"],
    "notes": "lowest blast radius first"
  }
}
```

After recording the decision, the gateway attempts to resume the run if it was paused on that approval.

### Signals

| Method | Access | Params | Returns |
| --- | --- | --- | --- |
| `signals.send` | `execute` | `runId`, `signalName`, `data?`, `correlationId?` | Delivery metadata including `seq` and `receivedAtMs` |

Use `signals.send` to wake workflows blocked on [`<Signal>` or `<WaitForEvent>`](/runtime/events).

```json
{
  "method": "signals.send",
  "params": {
    "runId": "run_123",
    "signalName": "github.comment",
    "correlationId": "pr-42",
    "data": {
      "body": "@smithers re-run the review",
      "author": "octocat"
    }
  }
}
```

After delivering the signal, the gateway tries to resume the run if it was waiting.

### Cron

| Method | Access | Params | Returns |
| --- | --- | --- | --- |
| `cron.list` | `read` | none | Cron rows across all registered workflows |
| `cron.add` | `admin` | `workflow`, `pattern`, `cronId?`, `enabled?` | The created cron row |
| `cron.remove` | `admin` | `cronId` | `{ cronId, removed: true }` |
| `cron.trigger` | `execute` | `cronId` or `workflow`, `input?` | `{ runId, workflow }` |

`cron.trigger` is handy for "run this scheduled job right now" buttons in UIs or bots.

## Role And Scope Based Access Control

The gateway stores both a `role` string and a list of `scopes`.

- `role` is identity metadata and is passed into `ctx.auth.role`
- `scopes` are what the gateway actually enforces

Scopes can be granted in three styles:

- Access level keywords: `read`, `execute`, `approve`, `admin`
- Exact method names: `runs.create`, `approvals.decide`
- Wildcards: `runs.*`, `cron.*`, `*`

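This token can call every `runs.*` method plus `health`, and nothing else. Because `runs.*` is a wildcard, it also covers execute-level methods such as `runs.create` and `runs.cancel`: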

```ts
{
  role: "viewer",
  scopes: ["runs.*", "health"]
}
```

This token can approve gates and, because of access ranking, also call `read` and `execute` methods:

```ts
{
  role: "approver",
  scopes: ["approve"],
  userId: "user:oncall"
}
```
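
One way to read the three scope styles together is as a single matching function. The sketch below is an interpretation, not the gateway's code: `isAuthorized` is a hypothetical name, the method table is a subset copied from the RPC reference, and prefix matching for `ns.*` wildcards is an assumption.

```ts
type AccessLevel = "read" | "execute" | "approve" | "admin";

const LEVEL_RANK: Record<AccessLevel, number> = { read: 0, execute: 1, approve: 2, admin: 3 };

// Required access level per method (subset of the RPC Methods Reference tables).
const METHOD_LEVEL: Record<string, AccessLevel> = {
  "health": "read",
  "runs.list": "read",
  "runs.get": "read",
  "runs.create": "execute",
  "runs.cancel": "execute",
  "approvals.list": "read",
  "approvals.decide": "approve",
  "signals.send": "execute",
  "cron.add": "admin",
  "cron.trigger": "execute",
};

function isAccessLevel(scope: string): scope is AccessLevel {
  return Object.prototype.hasOwnProperty.call(LEVEL_RANK, scope);
}

function isAuthorized(scopes: string[], method: string): boolean {
  if (!Object.prototype.hasOwnProperty.call(METHOD_LEVEL, method)) return false;
  const required = METHOD_LEVEL[method] as AccessLevel;
  return scopes.some((scope) => {
    if (scope === "*" || scope === method) return true; // global wildcard or exact name
    if (scope.endsWith(".*")) return method.startsWith(scope.slice(0, -1)); // "runs.*" -> "runs."
    // Access-level keyword: higher levels imply lower ones.
    return isAccessLevel(scope) && LEVEL_RANK[scope] >= LEVEL_RANK[required];
  });
}
```

Under this reading, `["approve"]` authorizes `runs.create` through ranking, while `["runs.*", "health"]` authorizes every `runs.*` method but not `approvals.list`.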

### Approval-Level Restrictions

You can add narrower restrictions at the workflow level:

```tsx
<Approval
  id="deploy-prod"
  output={outputs.approval}
  request={{ title: "Deploy to production?" }}
  allowedScopes={["approve"]}
  allowedUsers={["user:oncall", "user:release-manager"]}
/>
```

Even if a caller has general approval access, the gateway still checks those narrower constraints before accepting `approvals.decide`.
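
The narrower check can be sketched as follows. It treats the two restrictions as independent conditions that must both pass when both are set, and it matches `allowedScopes` by exact string; both points are assumptions, and `checkApprovalRestrictions` is a hypothetical name.

```ts
type Caller = { userId: string | null; scopes: string[] };
type ApprovalRestrictions = { allowedUsers?: string[]; allowedScopes?: string[] };
type RestrictionResult = { ok: true } | { ok: false; reason: string };

function checkApprovalRestrictions(caller: Caller, approval: ApprovalRestrictions): RestrictionResult {
  const users = approval.allowedUsers ?? [];
  if (users.length > 0) {
    if (caller.userId === null || !users.includes(caller.userId)) {
      return { ok: false, reason: "caller is not in allowedUsers" };
    }
  }
  const scopes = approval.allowedScopes ?? [];
  if (scopes.length > 0) {
    // The caller needs at least one of the listed scopes (exact match assumed).
    if (!scopes.some((scope) => caller.scopes.includes(scope))) {
      return { ok: false, reason: "caller has none of allowedScopes" };
    }
  }
  return { ok: true };
}
```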

## Cron Triggers

There are two ways to register cron schedules with the gateway.

### 1. Register A Workflow With A Schedule

```ts
gateway.register("nightly-report", reportWorkflow, {
  schedule: "0 2 * * *",
});
```

When the gateway starts listening, it writes or updates a cron row in the workflow database with:

- `cronId = "gateway:<workflowKey>"`
- the cron pattern
- the next scheduled fire time

### 2. Manage Crons At Runtime

You can create, delete, inspect, and trigger schedules with the `cron.*` RPC methods.

The gateway polls for due schedules on an interval derived from `heartbeatMs`, clamped between `1000ms` and `15000ms`.
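
As a rough illustration of the clamp, assuming the interval is simply `heartbeatMs` bounded to that range (the exact derivation is not specified here, and `cronPollIntervalMs` is a hypothetical name):

```ts
const MIN_POLL_MS = 1_000;
const MAX_POLL_MS = 15_000;

// Assumption: the poll interval is heartbeatMs itself, clamped to [1000, 15000].
function cronPollIntervalMs(heartbeatMs: number): number {
  return Math.min(MAX_POLL_MS, Math.max(MIN_POLL_MS, heartbeatMs));
}
```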

When a cron fires:

- the gateway starts a run
- it sets `ctx.auth.triggeredBy` to `"cron:gateway"`
- it sets `ctx.auth.role` to `"system"`
- it grants `ctx.auth.scopes = ["*"]`
- it emits `cron.triggered`

## Building A GitHub Bot With The Gateway

The gateway does not include a GitHub webhook server. The usual architecture is:

1. GitHub sends a webhook to your webhook receiver
2. The receiver verifies the GitHub signature
3. The receiver calls `POST /rpc` or keeps a WebSocket open
4. The workflow does the heavy lifting

Minimal webhook relay:

```ts
import { Hono } from "hono";

const app = new Hono();

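app.post("/github/webhooks", async (c) => {
  // Signature verification (step 2 above) is omitted for brevity;
  // verify the request before trusting the payload.
  const event = c.req.header("x-github-event");
  const payload = await c.req.json();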

  if (event === "pull_request" && payload.action === "opened") {
    await fetch("http://127.0.0.1:7331/rpc", {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${process.env.GATEWAY_TOKEN}`,
      },
      body: JSON.stringify({
        method: "runs.create",
        params: {
          workflow: "github-pr-review",
          input: {
            owner: payload.repository.owner.login,
            repo: payload.repository.name,
            pullNumber: payload.pull_request.number,
            installationId: payload.installation?.id,
            sender: payload.sender.login,
          },
        },
      }),
    });
  }

  return c.json({ ok: true });
});
```
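
The relay above does not show step 2. A minimal sketch of that check, assuming GitHub's `X-Hub-Signature-256` header (an HMAC-SHA256 hex digest of the raw body, prefixed with `sha256=`) and a webhook secret you configured yourself; the function names are hypothetical:

```ts
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the value GitHub would send in X-Hub-Signature-256 for this body.
function signGithubPayload(rawBody: string, secret: string): string {
  return "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Compare the received header with the expected signature in constant time.
function verifyGithubSignature(
  rawBody: string,
  signatureHeader: string | undefined,
  secret: string,
): boolean {
  if (!signatureHeader) return false;
  const expected = Buffer.from(signGithubPayload(rawBody, secret), "utf8");
  const received = Buffer.from(signatureHeader, "utf8");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The signature covers the raw request bytes, so a handler would read the body as text, verify it against the `x-hub-signature-256` header, and only then call `JSON.parse`.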

Use `signals.send` instead of `runs.create` when a long-lived run is already waiting and you need to deliver follow-up events to it, such as review commands, check results, or a maintainer comment.

See [GitHub Bot](/integrations/github-bot) for a full setup.

## Building A Slack Bot With The Gateway

Slack follows the same pattern:

1. Slack slash command or Events API request hits your app
2. Your app verifies the Slack signature
3. Your app starts or resumes a workflow through the gateway
4. The workflow posts back to Slack using your Slack client or tools

Example slash command relay:

```ts
app.post("/slack/commands/review", async (c) => {
  const form = await c.req.formData();

  await fetch("http://127.0.0.1:7331/rpc", {
    method: "POST",
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${process.env.GATEWAY_TOKEN}`,
    },
    body: JSON.stringify({
      method: "runs.create",
      params: {
        workflow: "slack-triage",
        input: {
          channel: form.get("channel_id"),
          user: form.get("user_id"),
          text: form.get("text"),
        },
      },
    }),
  });

  return c.text("Review started.");
});
```

For interactive Slack flows, a common pattern is:

- start a run with `runs.create`
- wait in the workflow with `<Signal>` or `<WaitForEvent>`
- deliver button clicks, form submissions, or thread replies with `signals.send`

## When To Use Gateway vs Other Server Modes

| Use case | Best fit |
| --- | --- |
| One local run, maybe with a tiny HTTP wrapper | [`--serve`](/integrations/serve) |
| Loading workflow files over REST by path | [`startServer()`](/integrations/server) |
| Long-lived bots, dashboards, approvals, signals, and cron | `Gateway` |

## Next Steps

- [GitHub Bot](/integrations/github-bot)
- [HTTP Server](/integrations/server)
- [Serve Mode](/integrations/serve)
- [Runtime Events](/runtime/events)
- [External Workflows](/integrations/external-workflows)

---

## External Workflows

> Build Smithers workflows from non-TSX sources — including Python scripts — using the host node JSON protocol and Pydantic schema auto-discovery.
> Source: https://smithers.sh/integrations/external-workflows

Smithers workflows are normally written in TSX. The external workflow API lets you drive the same engine from any process that can read stdin and write JSON to stdout. The TypeScript side handles agents, schemas, and database setup; the external process owns the build logic.

Python is the first-class external runtime. The `createPythonWorkflow` function wires everything together automatically.

## Import

```ts
import {
  createExternalSmithers,
  createPythonWorkflow,
  pydanticSchemaToZod,
  serializeCtx,
  hostNodeToReact,
} from "smithers-orchestrator/external";
```

---

## Host Node JSON Protocol

The bridge between an external process and the Smithers engine is the `HostNodeJson` type. Every time the engine calls the build function, it passes a serialized context on stdin and expects a `HostNodeJson` tree on stdout.

### HostNodeJson

```ts
type HostNodeJson =
  | {
      kind: "element";
      tag: string;
      props: Record<string, string>;
      rawProps: Record<string, any>;
      children: HostNodeJson[];
    }
  | { kind: "text"; text: string };
```

Each `element` node maps 1:1 to a JSX component (`Task`, `Approval`, `Signal`, etc.). The `tag` field is the component name as a string. `rawProps` carries the full prop values including non-string types; `props` carries the string-serialized version used for display.

### SerializedCtx

The engine serializes the current `SmithersCtx` before invoking the build function:

```ts
type SerializedCtx = {
  runId: string;
  iteration: number;
  iterations: Record<string, number>;
  input: any;
  outputs: OutputSnapshot;
};
```

The external process receives this as JSON on stdin, uses it to decide which nodes to emit, and writes a `HostNodeJson` tree to stdout.
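
The round trip can be sketched end to end in TypeScript. The types are repeated so the block stands alone; `OutputSnapshot` is approximated as a plain record, and the `topic` input field and the helper names are made up for illustration.

```ts
type HostNodeJson =
  | {
      kind: "element";
      tag: string;
      props: Record<string, string>;
      rawProps: Record<string, any>;
      children: HostNodeJson[];
    }
  | { kind: "text"; text: string };

type SerializedCtx = {
  runId: string;
  iteration: number;
  iterations: Record<string, number>;
  input: any;
  outputs: Record<string, unknown>; // stand-in for OutputSnapshot (assumption)
};

// Build one Task element whose child text node carries the prompt.
function task(id: string, agent: string, prompt: string): HostNodeJson {
  return {
    kind: "element",
    tag: "Task",
    props: { id },
    rawProps: { id, agent }, // agent is a string reference, resolved on the TypeScript side
    children: [{ kind: "text", text: prompt }],
  };
}

// Decide what to emit from the serialized context.
function build(ctx: SerializedCtx): HostNodeJson {
  const topic = typeof ctx.input?.topic === "string" ? ctx.input.topic : "unknown topic";
  return task("analyze", "claude", `Analyze ${topic} (iteration ${ctx.iteration})`);
}

// What an external process does per invocation: stdin text in, stdout text out.
function runBuild(stdinText: string): string {
  const ctx = JSON.parse(stdinText) as SerializedCtx;
  return JSON.stringify(build(ctx));
}
```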

### Agent Reference Resolution

String agent references in `rawProps.agent` are resolved back to live `AgentLike` objects before the tree reaches the engine. If a referenced agent name is not in the registry, the engine throws `UNKNOWN_AGENT` with the available agent names.

```ts
// External process emits:
{ kind: "element", tag: "Task", rawProps: { agent: "claude" }, ... }

// TypeScript side resolves "claude" → actual AgentLike before rendering
```
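
A possible shape for that resolution step is a recursive walk over the tree. This is a sketch of the idea, not the library's `hostNodeToReact`: `resolveAgents` and `UnknownAgentError` are hypothetical names, and agents are modeled as opaque values.

```ts
type HostNodeJson =
  | {
      kind: "element";
      tag: string;
      props: Record<string, string>;
      rawProps: Record<string, any>;
      children: HostNodeJson[];
    }
  | { kind: "text"; text: string };

class UnknownAgentError extends Error {
  code = "UNKNOWN_AGENT";
  available: string[];
  constructor(name: string, available: string[]) {
    super(`Unknown agent "${name}". Available agents: ${available.join(", ")}`);
    this.name = "UnknownAgentError";
    this.available = available;
  }
}

// Returns a copy of the tree with string agent references swapped for registry entries.
function resolveAgents<A>(node: HostNodeJson, agents: Record<string, A>): HostNodeJson {
  if (node.kind === "text") return node;
  const rawProps = { ...node.rawProps };
  const ref = rawProps["agent"];
  if (typeof ref === "string") {
    if (!Object.prototype.hasOwnProperty.call(agents, ref)) {
      throw new UnknownAgentError(ref, Object.keys(agents));
    }
    rawProps["agent"] = agents[ref];
  }
  return {
    ...node,
    rawProps,
    children: node.children.map((child) => resolveAgents(child, agents)),
  };
}
```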

---

## createExternalSmithers

The low-level factory. Use this when your build function is already written in TypeScript (e.g., wrapping a non-Python subprocess or a WASM module).

```ts
import { z } from "zod";
import { createExternalSmithers } from "smithers-orchestrator/external";

const workflow = createExternalSmithers({
  schemas: {
    analysis: z.object({ summary: z.string(), score: z.number() }),
  },
  agents: { claude: myClaudeAgent },
  buildFn: (ctx: SerializedCtx): HostNodeJson => {
    // Return a host node tree based on ctx
    return {
      kind: "element",
      tag: "Task",
      props: { id: "analyze" },
      rawProps: { id: "analyze", agent: myClaudeAgent },
      children: [],
    };
  },
});
```

### ExternalSmithersConfig

```ts
type ExternalSmithersConfig<S extends Record<string, z.ZodObject<any>>> = {
  schemas: S;
  agents: Record<string, AgentLike>;
  buildFn: (ctx: SerializedCtx) => HostNodeJson;
  dbPath?: string;
};
```

| Option | Type | Default | Description |
|---|---|---|---|
| `schemas` | `Record<string, ZodObject>` | required | Zod schemas for output tables |
| `agents` | `Record<string, AgentLike>` | required | Agent registry for ref resolution |
| `buildFn` | `(ctx: SerializedCtx) => HostNodeJson` | required | Synchronous build function |
| `dbPath` | `string` | ephemeral temp dir | Path for the SQLite database |

### Ephemeral SQLite Database

When `dbPath` is omitted, `createExternalSmithers` provisions an ephemeral SQLite database in a temp directory (`os.tmpdir()/smithers-ext-*/smithers.db`). WAL mode and a 5-second busy timeout are applied automatically. The database is closed on process exit. Pass an explicit `dbPath` for durable storage across restarts.

### serializeCtx

Serialize a live `SmithersCtx` to a `SerializedCtx` for passing to the build function or an external process:

```ts
import { serializeCtx } from "smithers-orchestrator/external";

const serialized = serializeCtx(ctx);
// { runId, iteration, iterations, input, outputs }
```

### hostNodeToReact

Convert a `HostNodeJson` tree to React elements, resolving string agent references:

```ts
import { hostNodeToReact } from "smithers-orchestrator/external";

const element = hostNodeToReact(hostNode, agents);
```

Throws `UNKNOWN_AGENT` if a referenced agent name is not present in the `agents` map.

---

## Python Integration

`createPythonWorkflow` is the recommended entry point for Python-defined workflows. It combines schema auto-discovery, subprocess management, and the host node protocol into a single call.

### Setup

Smithers uses [uv](https://github.com/astral-sh/uv) to run Python scripts. Install uv and ensure it is on `PATH`:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Your Python script must read a `SerializedCtx` JSON from stdin and write a `HostNodeJson` tree to stdout:

```python
import json
import sys

def run(ctx: dict) -> dict:
    return {
        "kind": "element",
        "tag": "Task",
        "props": {"id": "analyze"},
        "rawProps": {"id": "analyze", "agent": "claude"},
        "children": [],
    }

if __name__ == "__main__":
    ctx = json.loads(sys.stdin.read())
    print(json.dumps(run(ctx)))
```

### createPythonWorkflow

```ts
import { createPythonWorkflow } from "smithers-orchestrator/external";

const workflow = createPythonWorkflow({
  scriptPath: "./workflow.py",
  agents: { claude: myClaudeAgent },
});
```

Schemas are auto-discovered from the Python script's Pydantic models (see [Schema Auto-Discovery](#schema-auto-discovery) below). Pass explicit Zod schemas to skip discovery:

```ts
const workflow = createPythonWorkflow({
  scriptPath: "./workflow.py",
  agents: { claude: myClaudeAgent },
  schemas: {
    analysis: z.object({ summary: z.string(), score: z.number() }),
  },
});
```

### Configuration

```ts
type PythonWorkflowConfig = {
  scriptPath: string;
  agents: Record<string, AgentLike>;
  schemas?: Record<string, z.ZodObject<any>>;
  dbPath?: string;
  cwd?: string;
  timeoutMs?: number;
  env?: Record<string, string>;
};
```

| Option | Type | Default | Description |
|---|---|---|---|
| `scriptPath` | `string` | required | Path to the Python script (relative to `cwd`) |
| `agents` | `Record<string, AgentLike>` | required | Agent registry |
| `schemas` | `Record<string, ZodObject>` | auto-discovered | Zod schemas; omit to auto-discover from Pydantic |
| `dbPath` | `string` | ephemeral | SQLite database path |
| `cwd` | `string` | `process.cwd()` | Working directory for subprocess |
| `timeoutMs` | `number` | `30000` | Per-invocation timeout in milliseconds |
| `env` | `Record<string, string>` | `process.env` | Additional environment variables |

### Build Subprocess

Each time the engine calls the build function, Smithers spawns `uv run <scriptPath>` synchronously. The serialized context is passed on stdin; the process must write `HostNodeJson` to stdout and exit with code `0`.

A non-zero exit code, empty output, or invalid JSON each throw `EXTERNAL_BUILD_FAILED`. A timeout also throws `EXTERNAL_BUILD_FAILED`, with a timeout message. Stderr is captured and included in the error details.

### Build Output Validation

The host node output is validated for a `kind` field before reaching the engine. The minimal valid output is:

```json
{ "kind": "text", "text": "hello" }
```

Or an element node:

```json
{
  "kind": "element",
  "tag": "Task",
  "props": {},
  "rawProps": { "id": "step1", "agent": "claude" },
  "children": []
}
```
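
Taken together, the subprocess rules and the `kind` check amount to something like the function below. It is an illustrative sketch: `parseBuildResult`, `ExternalBuildError`, and the `BuildProcessResult` shape are hypothetical, and the real error object may carry different fields.

```ts
type BuildProcessResult = {
  exitCode: number | null;
  stdout: string;
  stderr: string;
  timedOut: boolean;
};

class ExternalBuildError extends Error {
  code = "EXTERNAL_BUILD_FAILED";
  stderr: string;
  constructor(message: string, stderr: string) {
    super(message);
    this.name = "ExternalBuildError";
    this.stderr = stderr;
  }
}

// Validate one build invocation and return the parsed host node.
function parseBuildResult(result: BuildProcessResult): { kind: string } & Record<string, unknown> {
  if (result.timedOut) throw new ExternalBuildError("build timed out", result.stderr);
  if (result.exitCode !== 0) {
    throw new ExternalBuildError(`build exited with code ${result.exitCode}`, result.stderr);
  }
  if (result.stdout.trim() === "") throw new ExternalBuildError("build produced no output", result.stderr);
  let parsed: unknown;
  try {
    parsed = JSON.parse(result.stdout);
  } catch {
    throw new ExternalBuildError("build output is not valid JSON", result.stderr);
  }
  if (
    typeof parsed !== "object" ||
    parsed === null ||
    typeof (parsed as Record<string, unknown>)["kind"] !== "string"
  ) {
    throw new ExternalBuildError("build output is missing a `kind` field", result.stderr);
  }
  return parsed as { kind: string } & Record<string, unknown>;
}
```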

---

## Schema Auto-Discovery

When `schemas` is omitted from `createPythonWorkflow`, Smithers runs the script with `--schemas` and parses the JSON output as a map of schema names to JSON Schema objects.

In your Python script, handle `--schemas` to emit Pydantic model schemas:

```python
import json
import sys
from pydantic import BaseModel

class Analysis(BaseModel):
    summary: str
    score: float

SCHEMAS = {"analysis": Analysis}

if __name__ == "__main__":
    if "--schemas" in sys.argv:
        print(json.dumps({
            name: model.model_json_schema()
            for name, model in SCHEMAS.items()
        }))
    else:
        ctx = json.loads(sys.stdin.read())
        # ... build and print HostNodeJson
```

Schema discovery runs once at startup. The discovered schemas are converted to Zod using `pydanticSchemaToZod` and passed to `createExternalSmithers`.

---

## Pydantic Schema Conversion

`pydanticSchemaToZod` converts a Pydantic v2 JSON Schema (from `model.model_json_schema()`) to a Zod object schema.

```ts
import { pydanticSchemaToZod } from "smithers-orchestrator/external";

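// `analysisJsonSchema` is the parsed JSON that Python's
// `Analysis.model_json_schema()` emitted (for example via `--schemas`).
const zodSchema = pydanticSchemaToZod(analysisJsonSchema);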
```

### Supported Patterns

| Pydantic Pattern | Zod Output |
|---|---|
| `type: "string"` with `minLength`/`maxLength`/`pattern` | `z.string().min().max().regex()` |
| `type: "number"` / `type: "integer"` with `minimum`/`maximum` | `z.number().min().max()`, with `.int()` added for `integer` |
| `type: "boolean"` | `z.boolean()` |
| `type: "array"` with `items` | `z.array(...)` |
| `type: "object"` with `properties` + `required` | `z.object(...)` with optional non-required fields |
| `enum: [...]` on a string field | `z.enum([...])` |
| `anyOf: [T, {type: "null"}]` (Optional) | `T.nullable()` |
| `allOf: [A, B]` | `z.intersection(A, B)` |
| `oneOf: [A, B, ...]` | `z.union([A, B, ...])` |
| `$ref: "#/$defs/ModelName"` | Resolved inline (circular refs become `z.any()`) |
| `default: value` | `.default(value)` |
| `description: "..."` | `.describe("...")` |

### $ref Resolution

Pydantic places nested models under `$defs`. `pydanticSchemaToZod` resolves `#/$defs/ModelName` references inline using a JSON Pointer walk. Circular references are detected and collapsed to `z.any()` to prevent infinite recursion.

### nullable anyOf Collapse

Pydantic represents `Optional[T]` as `anyOf: [T, {type: "null"}]`. The converter detects this two-variant pattern and collapses it to `T.nullable()` for clean column mapping in the SQLite schema.

### allOf Intersection

`allOf` with a single entry is unwrapped directly. Multiple entries produce `z.intersection(A, z.intersection(B, ...))`.
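
The three rules above (`$ref` resolution, nullable collapse, `allOf` intersection) can be illustrated without a Zod dependency by emitting the Zod expression as source text. This is a simplified sketch, not the real converter: `toZodSource` is a hypothetical function, it ignores constraints such as `minLength` and `default`, and the actual `pydanticSchemaToZod` returns Zod schema objects rather than strings.

```ts
type JsonSchema = {
  type?: string;
  properties?: Record<string, JsonSchema>;
  required?: string[];
  items?: JsonSchema;
  enum?: string[];
  anyOf?: JsonSchema[];
  allOf?: JsonSchema[];
  $ref?: string;
  $defs?: Record<string, JsonSchema>;
};

function toZodSource(
  schema: JsonSchema,
  defs: Record<string, JsonSchema> = schema.$defs ?? {},
  seen: Set<string> = new Set(),
): string {
  if (schema.$ref) {
    const name = schema.$ref.replace("#/$defs/", "");
    const target = defs[name];
    // Unknown or circular references collapse to z.any().
    if (!target || seen.has(name)) return "z.any()";
    return toZodSource(target, defs, new Set(seen).add(name));
  }
  if (schema.anyOf && schema.anyOf.length === 2) {
    // Optional[T] arrives as anyOf: [T, { type: "null" }].
    const nonNull = schema.anyOf.filter((s) => s.type !== "null");
    const only = nonNull[0];
    if (nonNull.length === 1 && only) return `${toZodSource(only, defs, seen)}.nullable()`;
  }
  if (schema.allOf && schema.allOf.length > 0) {
    const parts = schema.allOf.map((s) => toZodSource(s, defs, seen));
    // [A] -> A; [A, B, C] -> z.intersection(A, z.intersection(B, C))
    return parts.reduceRight((acc, part) => `z.intersection(${part}, ${acc})`);
  }
  if (schema.enum) return `z.enum(${JSON.stringify(schema.enum)})`;
  switch (schema.type) {
    case "string":
      return "z.string()";
    case "integer":
      return "z.number().int()";
    case "number":
      return "z.number()";
    case "boolean":
      return "z.boolean()";
    case "array":
      return `z.array(${schema.items ? toZodSource(schema.items, defs, seen) : "z.any()"})`;
    case "object": {
      const required = new Set(schema.required ?? []);
      const fields = Object.entries(schema.properties ?? {}).map(([key, value]) => {
        const inner = toZodSource(value, defs, seen);
        return `${JSON.stringify(key)}: ${required.has(key) ? inner : inner + ".optional()"}`;
      });
      return `z.object({ ${fields.join(", ")} })`;
    }
    default:
      return "z.any()";
  }
}
```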

---

## Full Example

```ts
// workflow.ts
import { createPythonWorkflow } from "smithers-orchestrator/external";
import { Anthropic } from "@anthropic-ai/sdk";

const claude = new Anthropic();

export default createPythonWorkflow({
  scriptPath: "./workflow.py",
  cwd: import.meta.dir,
  agents: { claude },
  timeoutMs: 60_000,
  env: { ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY! },
});
```

```python
# workflow.py
import json
import sys
from pydantic import BaseModel

class BugReport(BaseModel):
    title: str
    severity: str
    description: str

SCHEMAS = {"bugReport": BugReport}

def build(ctx: dict) -> dict:
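    # Prior outputs are available on ctx["outputs"]. This flag is unused here,
    # but it shows where a build could branch once a report exists.
    has_report = bool(ctx["outputs"].get("bugReport"))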
    return {
        "kind": "element",
        "tag": "Task",
        "props": {"id": "triage"},
        "rawProps": {
            "id": "triage",
            "agent": "claude",
            "output": "bugReport",
        },
        "children": [
            {
                "kind": "text",
                "text": f"Triage this issue: {ctx['input'].get('description', '')}",
            }
        ],
    }

if __name__ == "__main__":
    if "--schemas" in sys.argv:
        print(json.dumps({k: v.model_json_schema() for k, v in SCHEMAS.items()}))
    else:
        ctx = json.loads(sys.stdin.read())
        print(json.dumps(build(ctx)))
```

---

## IDE Integration

> Control the Smithers IDE from workflows and tools — open files, show diffs, run terminals, ask users questions, and display overlays via smithers-ctl.
> Source: https://smithers.sh/integrations/ide

Smithers can talk to a running IDE instance through `smithers-ctl`, a CLI binary that the IDE installs and keeps on `PATH`. When both the binary and the right environment signals are present, workflows gain access to editor-native UI: file navigation, diff previews, terminal tabs, chat overlays, and webviews.

---

## Supported IDEs

The IDE integration is backed by `smithers-ctl`. Any IDE that ships `smithers-ctl` and sets the correct environment signals is supported. The environment detection checks five signals in order:

| Signal | Value |
|---|---|
| `SMITHERS_IDE` | `1`, `true`, or `yes` |
| `SMITHERS_CTL_ACTIVE` | `1`, `true`, or `yes` |
| `SMITHERS_SESSION_KIND` | `ide` |
| `TERM_PROGRAM` | `smithers` |
| `__CFBundleIdentifier` | contains `smithers` (macOS app bundles) |

At least one signal must be active for the environment to be considered live. The binary must also be executable on `PATH` (or at an absolute path if configured).
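
The signal table translates into a straightforward environment check. The sketch below is an approximation: `detectIdeSignals` is a hypothetical name, reporting signals by their variable names is an assumption, and so is the case-insensitive comparison of the truthy values.

```ts
type Env = Record<string, string | undefined>;

const TRUTHY_VALUES = new Set(["1", "true", "yes"]);

function isTruthy(value: string | undefined): boolean {
  // Case-insensitive matching is an assumption; the table lists lowercase values.
  return value !== undefined && TRUTHY_VALUES.has(value.trim().toLowerCase());
}

// Returns the names of the active signals, in the order the table lists them.
function detectIdeSignals(env: Env): string[] {
  const active: string[] = [];
  if (isTruthy(env["SMITHERS_IDE"])) active.push("SMITHERS_IDE");
  if (isTruthy(env["SMITHERS_CTL_ACTIVE"])) active.push("SMITHERS_CTL_ACTIVE");
  if (env["SMITHERS_SESSION_KIND"] === "ide") active.push("SMITHERS_SESSION_KIND");
  if (env["TERM_PROGRAM"] === "smithers") active.push("TERM_PROGRAM");
  if ((env["__CFBundleIdentifier"] ?? "").includes("smithers")) active.push("__CFBundleIdentifier");
  return active;
}
```

An empty result corresponds to an inactive environment; one or more entries means the environment half of the availability check passes.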

---

## Import

```ts
import {
  getSmithersIdeAvailability,
  isSmithersIdeAvailable,
  createSmithersIdeService,
  createSmithersIdeLayer,
  createAvailableSmithersIdeCli,
  openFile,
  openDiff,
  showOverlay,
  runTerminal,
  askUser,
  openWebview,
  SmithersIdeService,
  type SmithersIdeAvailability,
  type SmithersIdeServiceConfig,
} from "smithers-orchestrator/ide";
```

---

## Availability

### isSmithersIdeAvailable

Quick boolean check. Resolves `true` only when the binary is found and at least one environment signal is active.

```ts
const available = await isSmithersIdeAvailable();
// true | false
```

### getSmithersIdeAvailability

Full availability report with the reason and which signals fired.

```ts
const availability = await getSmithersIdeAvailability();

if (availability.available) {
  console.log("IDE found at:", availability.binaryPath);
  console.log("Active signals:", availability.signals);
} else {
  console.log("Not available:", availability.reason);
  // "binary-missing" | "environment-inactive"
}
```

### SmithersIdeAvailability

```ts
type SmithersIdeAvailability =
  | {
      available: true;
      binaryAvailable: true;
      binaryPath: string;
      environmentActive: true;
      reason: "available";
      signals: readonly string[];
    }
  | {
      available: false;
      binaryAvailable: boolean;
      binaryPath: string | null;
      environmentActive: boolean;
      reason: "binary-missing" | "environment-inactive";
      signals: readonly string[];
    };
```

`reason` distinguishes the two failure modes. `binary-missing` means `smithers-ctl` was not found on `PATH`. `environment-inactive` means the binary exists but none of the environment signals are set, which usually means the process is running outside the IDE.

---

## Configuration

All service constructors accept an optional `SmithersIdeServiceConfig`:

```ts
type SmithersIdeServiceConfig = {
  command?: string;          // Default: "smithers-ctl"
  cwd?: string;              // Default: process.cwd()
  env?: Record<string, string | undefined>;  // Default: process.env
  idleTimeoutMs?: number;    // Default: 2000
  maxOutputBytes?: number;   // Default: 200000
  timeoutMs?: number;        // Default: 10000
};
```

| Option | Default | Description |
|---|---|---|
| `command` | `"smithers-ctl"` | Binary name or absolute path |
| `cwd` | `process.cwd()` | Working directory for subprocess |
| `env` | `process.env` | Environment for subprocess |
| `idleTimeoutMs` | `2000` | Idle timeout in milliseconds |
| `maxOutputBytes` | `200000` (200 KB) | Max captured stdout/stderr |
| `timeoutMs` | `10000` | Hard timeout per command in milliseconds |

---

## Service API

### createSmithersIdeService

Returns a `SmithersIdeServiceApi` with all IDE operations as Effect-returning methods.

```ts
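import { Effect } from "effect";
import { createSmithersIdeService } from "smithers-orchestrator/ide";

const service = createSmithersIdeService({ timeoutMs: 5000 });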

const result = await Effect.runPromise(
  service.openFile("/src/index.ts", 42, 1),
);
```

### SmithersIdeServiceApi

```ts
type SmithersIdeServiceApi = {
  config: SmithersIdeResolvedConfig;
  detectAvailability: () => Effect.Effect<SmithersIdeAvailability>;
  openFile: (path: string, line?: number, column?: number) =>
    Effect.Effect<SmithersIdeOpenFileResult, SmithersError>;
  openDiff: (content: string) =>
    Effect.Effect<SmithersIdeOpenDiffResult, SmithersError>;
  showOverlay: (type: SmithersIdeOverlayType, options: SmithersIdeOverlayOptions) =>
    Effect.Effect<SmithersIdeOverlayResult, SmithersError>;
  runTerminal: (command: string, cwd?: string) =>
    Effect.Effect<SmithersIdeRunTerminalResult, SmithersError>;
  askUser: (prompt: string) =>
    Effect.Effect<SmithersIdeAskUserResult, SmithersError>;
  openWebview: (url: string) =>
    Effect.Effect<SmithersIdeOpenWebviewResult, SmithersError>;
};
```

---

## API Reference

### openFile

Open a file in the IDE, optionally jumping to a line and column.

```ts
openFile(path: string, line?: number, column?: number)
  => Effect.Effect<SmithersIdeOpenFileResult, SmithersError>
```

`column` requires `line`. Passing `column` without `line` fails with `INVALID_INPUT`.

```ts
// Open a file
service.openFile("/src/utils.ts");

// Jump to line 100
service.openFile("/src/utils.ts", 100);

// Jump to line 100, column 5
service.openFile("/src/utils.ts", 100, 5);
```

Invokes: `smithers-ctl open <path> [+line[:col]]`

```ts
type SmithersIdeOpenFileResult = {
  args: readonly string[];
  column: number | null;
  command: string;
  exitCode: number | null;
  line: number | null;
  opened: boolean;
  path: string;
  stderr: string;
  stdout: string;
};
```
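
The argument rules for `open` can be sketched as a small builder. This mirrors the documented invocation shape only; `buildOpenArgs` is a hypothetical helper, and the real service reports `INVALID_INPUT` through `SmithersError` rather than a plain `Error`.

```ts
// Builds the argument list for `smithers-ctl open <path> [+line[:col]]`.
function buildOpenArgs(path: string, line?: number, column?: number): string[] {
  if (column !== undefined && line === undefined) {
    throw new Error("INVALID_INPUT: column requires line");
  }
  const args = ["open", path];
  if (line !== undefined) {
    args.push(column !== undefined ? `+${line}:${column}` : `+${line}`);
  }
  return args;
}
```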

### openDiff

Open a unified diff preview in the IDE.

```ts
openDiff(content: string)
  => Effect.Effect<SmithersIdeOpenDiffResult, SmithersError>
```

```ts
service.openDiff(`--- a/src/index.ts
+++ b/src/index.ts
@@ -1,3 +1,4 @@
 import { foo } from "./foo";
+import { bar } from "./bar";
`);
```

Invokes: `smithers-ctl diff show --content <content>`

```ts
type SmithersIdeOpenDiffResult = {
  args: readonly string[];
  command: string;
  exitCode: number | null;
  opened: boolean;
  stderr: string;
  stdout: string;
};
```

### showOverlay

Show an overlay in the IDE.

```ts
showOverlay(type: SmithersIdeOverlayType, options: SmithersIdeOverlayOptions)
  => Effect.Effect<SmithersIdeOverlayResult, SmithersError>
```

```ts
type SmithersIdeOverlayType = "chat" | "progress" | "panel";

type SmithersIdeOverlayOptions = {
  message: string;
  title?: string;
  position?: "top" | "center" | "bottom";
  duration?: number;   // seconds
  percent?: number;    // 0–100, for "progress" type
};
```

```ts
// Progress bar at 60%
service.showOverlay("progress", {
  message: "Running tests...",
  title: "Test Suite",
  percent: 60,
  position: "bottom",
});

// Chat message
service.showOverlay("chat", {
  message: "Deployment complete.",
  duration: 5,
});
```

Invokes: `smithers-ctl overlay --type <type> --message <message> [--title ...] [--position ...] [--duration ...] [--percent ...]`

```ts
type SmithersIdeOverlayResult = {
  args: readonly string[];
  command: string;
  exitCode: number | null;
  overlayId: string | null;
  shown: boolean;
  stderr: string;
  stdout: string;
  type: SmithersIdeOverlayType;
};
```

### runTerminal

Run a command in a new IDE terminal tab.

```ts
runTerminal(command: string, cwd?: string)
  => Effect.Effect<SmithersIdeRunTerminalResult, SmithersError>
```

```ts
service.runTerminal("npm test", "/workspace/my-project");
```

Invokes: `smithers-ctl terminal [--cwd <cwd>] run <command>`

```ts
type SmithersIdeRunTerminalResult = {
  args: readonly string[];
  command: string;
  cwd: string | null;
  exitCode: number | null;
  launched: boolean;
  status: string;
  stderr: string;
  stdout: string;
  terminalCommand: string;
};
```

### askUser

Prompt the user with a chat overlay. `askUser` is a shim: it displays the prompt via `showOverlay("chat", ...)` and returns as soon as the overlay is shown, with `status: "prompted"`. It does not return the user's answer; collect the response through the IDE's chat interface.

```ts
askUser(prompt: string)
  => Effect.Effect<SmithersIdeAskUserResult, SmithersError>
```

```ts
service.askUser("Which environment should I deploy to?");
```

```ts
type SmithersIdeAskUserResult = {
  args: readonly string[];
  command: string;
  exitCode: number | null;
  overlayId: string | null;
  prompt: string;
  status: "prompted";
  stderr: string;
  stdout: string;
};
```
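
The shim behaviour can be sketched as a thin wrapper around an overlay call. This is a simplified illustration under stated assumptions: it uses Promises instead of Effect for brevity, the result types are trimmed-down stand-ins, and `askUserShim` is a hypothetical name.

```ts
// Simplified stand-ins for the real result types (assumptions for illustration).
type OverlayResult = { overlayId: string | null; shown: boolean };
type AskUserResult = { overlayId: string | null; prompt: string; status: "prompted" };

type ShowOverlay = (type: "chat", options: { message: string }) => Promise<OverlayResult>;

// Delegates to showOverlay and resolves as soon as the overlay is shown.
// It does not wait for, or return, the user's answer.
async function askUserShim(showOverlay: ShowOverlay, prompt: string): Promise<AskUserResult> {
  const overlay = await showOverlay("chat", { message: prompt });
  return { overlayId: overlay.overlayId, prompt, status: "prompted" };
}
```

Because the result only ever reports `"prompted"`, a workflow that needs the answer has to obtain it through a separate channel.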

### openWebview

Open a URL in an IDE webview tab.

```ts
openWebview(url: string)
  => Effect.Effect<SmithersIdeOpenWebviewResult, SmithersError>
```

```ts
service.openWebview("https://smithers.dev/runs/smi_abc123");
```

Invokes: `smithers-ctl webview open <url>`

```ts
type SmithersIdeOpenWebviewResult = {
  args: readonly string[];
  command: string;
  exitCode: number | null;
  opened: boolean;
  stderr: string;
  stdout: string;
  tabId: string | null;
  url: string;
};
```

---

## Effect Layer

Use `createSmithersIdeLayer` to provide `SmithersIdeService` as an Effect Layer, then use the module-level Effect constructors (`openFile`, `openDiff`, etc.) that read from the service via `Context.Tag`.

```ts
import { Effect, Layer } from "effect";
import {
  createSmithersIdeLayer,
  openFile,
  showOverlay,
  SmithersIdeService,
} from "smithers-orchestrator/ide";

const IdeLayer = createSmithersIdeLayer({ timeoutMs: 8000 });

const program = Effect.gen(function* () {
  yield* openFile("/src/index.ts", 1);
  yield* showOverlay("chat", { message: "Opened index.ts" });
});

Effect.runPromise(Effect.provide(program, IdeLayer));
```

The module-level functions (`openFile`, `openDiff`, `showOverlay`, `runTerminal`, `askUser`, `openWebview`) each call `Effect.flatMap(SmithersIdeService, ...)` and require `SmithersIdeService` in the context.

### SmithersIdeService Tag

```ts
class SmithersIdeService extends Context.Tag("SmithersIdeService")<
  SmithersIdeService,
  SmithersIdeServiceApi
>() {}
```

---

## MCP CLI Namespace

`createSmithersIdeCli` returns a CLI object with all six IDE tools registered under the `smithers-ide` namespace. This is the integration point for MCP tool servers.

```ts
import { createSmithersIdeCli, SMITHERS_IDE_TOOL_NAMES } from "smithers-orchestrator/ide";

const cli = createSmithersIdeCli({ timeoutMs: 10_000 });
// cli is an incur Cli instance with all six tools
```

### Tool Names

```ts
const SMITHERS_IDE_TOOL_NAMES = [
  "smithers_ide_open_file",
  "smithers_ide_open_diff",
  "smithers_ide_show_overlay",
  "smithers_ide_run_terminal",
  "smithers_ide_ask_user",
  "smithers_ide_open_webview",
] as const;
```

### Tool Schemas

| Tool | Required Args | Optional Args |
|---|---|---|
| `smithers_ide_open_file` | `path: string` | `line: number`, `col: number` |
| `smithers_ide_open_diff` | `content: string` | — |
| `smithers_ide_show_overlay` | `type: "chat"\|"progress"\|"panel"`, `message: string` | `title`, `position`, `duration`, `percent` |
| `smithers_ide_run_terminal` | `cmd: string` | `cwd: string` |
| `smithers_ide_ask_user` | `prompt: string` | — |
| `smithers_ide_open_webview` | `url: string` (URL) | — |

---

## IDE-Gated CLI Commands

`createAvailableSmithersIdeCli` is a convenience wrapper that returns the CLI only when the IDE is available, and `null` otherwise. Use it to conditionally register IDE tools:

```ts
import { createAvailableSmithersIdeCli } from "smithers-orchestrator/ide";

const ideCli = await createAvailableSmithersIdeCli();
if (ideCli) {
  // Register IDE tools with your MCP server
  server.registerCli(ideCli);
}
```

---

## Error Handling

Every operation fails with a `SmithersError` in the Effect error channel; under `Effect.runPromise` that surfaces as a rejected promise.

| Code | Cause |
|---|---|
| `INVALID_INPUT` | Empty `path`, `content`, `command`, or `url`; or `column` provided without `line` |
| `PROCESS_SPAWN_FAILED` | `smithers-ctl` not found on `PATH` or not executable |
| `TOOL_COMMAND_FAILED` | `smithers-ctl` exited with a non-zero exit code |

`PROCESS_SPAWN_FAILED` with `ENOENT` produces a human-readable message: `smithers-ctl is not installed or not on PATH`.
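
The table can be read as a classification over how the `smithers-ctl` process ended. The sketch below illustrates that mapping; `SpawnOutcome` and `classifyOutcome` are hypothetical names, and the non-`ENOENT` message text is an assumption.

```ts
type IdeErrorCode = "INVALID_INPUT" | "PROCESS_SPAWN_FAILED" | "TOOL_COMMAND_FAILED";

// Hypothetical shape for the outcome of spawning `smithers-ctl`.
type SpawnOutcome =
  | { kind: "spawn-error"; errno: string } // e.g. "ENOENT", "EACCES"
  | { kind: "exited"; exitCode: number };

// Returns null on success, otherwise the error code and a message.
function classifyOutcome(outcome: SpawnOutcome): { code: IdeErrorCode; message: string } | null {
  if (outcome.kind === "spawn-error") {
    const message =
      outcome.errno === "ENOENT"
        ? "smithers-ctl is not installed or not on PATH"
        : `failed to spawn smithers-ctl (${outcome.errno})`;
    return { code: "PROCESS_SPAWN_FAILED", message };
  }
  if (outcome.exitCode !== 0) {
    return { code: "TOOL_COMMAND_FAILED", message: `smithers-ctl exited with code ${outcome.exitCode}` };
  }
  return null;
}
```

`INVALID_INPUT` does not appear in the function because input validation happens before any process is spawned.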

---

## Full Example

```ts
import { Effect } from "effect";
import {
  getSmithersIdeAvailability,
  createSmithersIdeService,
} from "smithers-orchestrator/ide";

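// `generatedDiff` is a unified diff string produced earlier by your workflow.
async function runIdeWorkflow(generatedDiff: string) {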
  const availability = await getSmithersIdeAvailability();

  if (!availability.available) {
    console.log(`IDE not available: ${availability.reason}`);
    return;
  }

  const service = createSmithersIdeService();

  await Effect.runPromise(
    Effect.gen(function* () {
      // Open the entrypoint
      yield* service.openFile("/workspace/src/index.ts", 1);

      // Show a progress overlay while working
      yield* service.showOverlay("progress", {
        message: "Analyzing codebase...",
        title: "Smithers",
        percent: 0,
        position: "bottom",
      });

      // Run tests in a new terminal tab
      yield* service.runTerminal("npm test", "/workspace");

      // Open a diff when done
      yield* service.openDiff(generatedDiff);

      // Ask the user what to do next
      yield* service.askUser("Tests passed. Should I open a PR?");
    }),
  );
}
```

---

## GitHub Bot

> Build a GitHub App backed by Smithers workflows using webhooks, gateway RPC, approvals, signals, comments, PR creation, and checks.
> Source: https://smithers.sh/integrations/github-bot

Smithers does not ship a turnkey GitHub bot server.

What it does give you is the orchestration layer you actually want behind one:

- durable [workflows](/concepts/workflows-overview)
- [approvals](/concepts/approvals)
- resumable [signals](/runtime/events)
- [gateway RPC](/integrations/gateway)
- [built-in tools](/integrations/tools) and custom tools

The usual shape is:

1. a GitHub App receives webhooks
2. your webhook receiver verifies the GitHub signature
3. the receiver calls the Smithers gateway
4. workflows do the actual work

## Architecture

Typical wiring is a [gateway](/integrations/gateway) client over [WebSocket](https://developer.mozilla.org/en-US/docs/Web/API/WebSocket) or `POST /rpc`, with workflows calling the [GitHub API](https://docs.github.com/en/rest).

```txt
GitHub App
  -> webhook receiver
  -> Gateway (WebSocket or POST /rpc)
  -> Smithers workflows
  -> GitHub API client/tools
```

Use `runs.create` when the webhook should start fresh work.

Use `signals.send` when the [workflow](/concepts/workflows-overview) is already running and is waiting for a follow-up event such as:

- a maintainer comment
- a label change
- a check completion
- a merge event

## 1. Create The GitHub App

In GitHub App settings, configure:

- A webhook URL pointing at your receiver
- A webhook secret for signature verification
- Installation permissions that match what your workflows will do

Common permissions for a PR review bot:

- `Contents: Read`
- `Pull requests: Read and write`
- `Issues: Read and write`
- `Checks: Read and write`
- `Metadata: Read`

Common webhook subscriptions:

- `pull_request`
- `issues`
- `issue_comment`
- `pull_request_review_comment`
- `check_suite`
- `check_run`

`@mentions` usually arrive through comment events:

- `issue_comment` for issue comments and PR conversation comments
- `pull_request_review_comment` for inline review comments

## 2. Start The Gateway

The [gateway](/integrations/gateway) is the remote control surface your bot talks to.

```tsx
/** @jsxImportSource smithers-orchestrator */
import {
  Gateway,
  Sequence,
  Task,
  Workflow,
  createSmithers,
} from "smithers-orchestrator";
import { z } from "zod";

const { smithers, outputs } = createSmithers({
  review: z.object({
    summary: z.string(),
    commentBody: z.string(),
    shouldBlock: z.boolean(),
  }),
  publish: z.object({
    commentId: z.number().nullable(),
    checkRunId: z.number().nullable(),
  }),
});

// `reviewer` is your agent instance, defined elsewhere.
export const reviewWorkflow = smithers((ctx) => (
  <Workflow name="github-pr-review">
    <Sequence>
      <Task id="review" output={outputs.review} agent={reviewer}>
        {`Review PR #${ctx.input.pullNumber} in ${ctx.input.owner}/${ctx.input.repo}.`}
      </Task>

      <Task id="publish" output={outputs.publish}>
        {async () => {
          return {
            commentId: null,
            checkRunId: null,
          };
        }}
      </Task>
    </Sequence>
  </Workflow>
));

const gateway = new Gateway({
  auth: {
    mode: "token",
    tokens: {
      [process.env.GATEWAY_TOKEN!]: {
        role: "github-bot",
        scopes: ["*"],
        userId: "bot:github",
      },
    },
  },
});

gateway.register("github-pr-review", reviewWorkflow);
await gateway.listen({ port: 7331 });
```

## 3. Receive Webhooks And Call The Gateway

The receiver can be any HTTP framework. It does two jobs:

1. verify the GitHub signature
2. translate webhook payloads into gateway RPC calls

```ts
import { Hono } from "hono";

const app = new Hono();

app.post("/github/webhooks", async (c) => {
  const event = c.req.header("x-github-event");
  const deliveryId = c.req.header("x-github-delivery");
  const payload = await c.req.json();

  if (
    event === "pull_request" &&
    ["opened", "synchronize", "reopened"].includes(payload.action)
  ) {
    await fetch("http://127.0.0.1:7331/rpc", {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${process.env.GATEWAY_TOKEN}`,
      },
      body: JSON.stringify({
        method: "runs.create",
        params: {
          workflow: "github-pr-review",
          input: {
            owner: payload.repository.owner.login,
            repo: payload.repository.name,
            pullNumber: payload.pull_request.number,
            installationId: payload.installation?.id,
            sender: payload.sender.login,
            deliveryId,
          },
        },
      }),
    });
  }

  return c.json({ ok: true });
});
```

That pattern is enough for "start a workflow when a PR opens."

For richer bots, keep the run alive and resume it with [signals](/runtime/events) instead of starting over.
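
The receiver above parses the payload without performing its first job, signature verification. GitHub signs each delivery with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex>`. A minimal sketch of the check, using only Node's `crypto` module; `verifyGithubSignature` is a hypothetical helper name.

```ts
import { createHmac, timingSafeEqual } from "node:crypto";

// Verifies the `X-Hub-Signature-256` header GitHub sends with each webhook.
// `rawBody` must be the exact request body, before any JSON parsing.
function verifyGithubSignature(
  secret: string,
  rawBody: string,
  signatureHeader: string | undefined,
): boolean {
  if (!signatureHeader || !signatureHeader.startsWith("sha256=")) return false;
  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(signatureHeader, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

In the Hono handler this would run before the payload is parsed: read the raw text with `await c.req.text()`, respond with a 401 when verification fails, then `JSON.parse` the same string.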

## 4. Map GitHub Events To Workflow Actions

| GitHub event | Typical gateway action | Why |
| --- | --- | --- |
| `pull_request.opened` | `runs.create` | Start initial review or triage |
| `pull_request.synchronize` | `runs.create` or `signals.send` | Re-review after new commits |
| `issues.opened` | `runs.create` | Triage issues, route labels, draft replies |
| `issue_comment.created` | `signals.send` | Continue an existing run after a maintainer or user reply |
| `pull_request_review_comment.created` | `signals.send` | React to inline feedback |
| `check_run.completed` | `signals.send` | Resume a workflow waiting on CI or another bot |
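
The table above translates naturally into a lookup keyed by `<event>.<action>`. The sketch below is one way to encode it; `ROUTES` and `routeWebhook` are hypothetical names, and the choice for `pull_request.synchronize` is an assumption since the table allows either method.

```ts
type GatewayMethod = "runs.create" | "signals.send";

// Keys are `<event>.<action>` pairs from the table above.
const ROUTES: Record<string, GatewayMethod> = {
  "pull_request.opened": "runs.create",
  // The table allows either method here; this sketch starts a fresh run.
  "pull_request.synchronize": "runs.create",
  "issues.opened": "runs.create",
  "issue_comment.created": "signals.send",
  "pull_request_review_comment.created": "signals.send",
  "check_run.completed": "signals.send",
};

// Returns the gateway method for a webhook, or null when the event is ignored.
function routeWebhook(event: string, action: string): GatewayMethod | null {
  return ROUTES[`${event}.${action}`] ?? null;
}
```

Keeping the mapping in data rather than in nested `if` statements makes it easy to see which events the bot ignores.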

### Handling `@mentions`

`@mentions` are usually just filtered comment events:

```ts
if (
  event === "issue_comment" &&
  typeof payload.comment?.body === "string" &&
  payload.comment.body.includes("@smithers")
) {
  await fetch("http://127.0.0.1:7331/rpc", {
    method: "POST",
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${process.env.GATEWAY_TOKEN}`,
    },
    body: JSON.stringify({
      method: "signals.send",
      params: {
        runId: findRunIdForThread(payload),
        signalName: "github.comment",
        correlationId: `pr-${payload.issue.number}`,
        data: {
          body: payload.comment.body,
          author: payload.comment.user.login,
          url: payload.comment.html_url,
        },
      },
    }),
  });
}
```

That lets you build workflows that wait for commands like:

- `@smithers re-run review`
- `@smithers summarize the blockers`
- `@smithers draft the changelog`
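
A plain `includes("@smithers")` check also matches handles such as `@smithersbot`, and it does not extract the command text. A slightly stricter parser might look like the sketch below; `parseMention` is a hypothetical helper, and it assumes the handle contains no regex metacharacters.

```ts
// Extracts the command text that follows an `@smithers` mention.
// Returns null when the comment does not mention the bot.
function parseMention(body: string, handle = "smithers"): string | null {
  // The mention must start the comment or follow whitespace, and the handle
  // must end at a word boundary so "@smithersbot" does not match.
  const pattern = new RegExp(`(?:^|\\s)@${handle}\\b[ \\t]*([^\\n]*)`, "i");
  const match = pattern.exec(body);
  if (!match) return null;
  return (match[1] ?? "").trim();
}
```

The returned string can be passed along in the signal's `data` so the workflow does not have to re-parse the comment body.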

## 5. Call The GitHub API From Workflows

There are two common approaches.

### Custom Tools With Octokit

This is the cleanest option when you only need a handful of GitHub actions.

```ts
import { App } from "octokit";
import { defineTool } from "smithers-orchestrator";
import { z } from "zod";

const githubApp = new App({
  appId: process.env.GITHUB_APP_ID!,
  privateKey: process.env.GITHUB_APP_PRIVATE_KEY!,
});

async function installationClient(installationId: number) {
  return await githubApp.getInstallationOctokit(installationId);
}

export const listPullFiles = defineTool({
  name: "github.list_pull_files",
  description: "List changed files in a pull request",
  schema: z.object({
    installationId: z.number(),
    owner: z.string(),
    repo: z.string(),
    pullNumber: z.number(),
  }),
  async execute({ installationId, owner, repo, pullNumber }) {
    const octokit = await installationClient(installationId);
    const { data } = await octokit.rest.pulls.listFiles({
      owner,
      repo,
      pull_number: pullNumber,
    });
    return data.map((file) => ({
      filename: file.filename,
      status: file.status,
      patch: file.patch ?? null,
    }));
  },
});
```

### OpenAPI Tools

If you already have a GitHub REST spec or a thin proxy with a smaller OpenAPI surface, `createOpenApiTools()` works well too. That is most useful when you want the agent to choose among many GitHub operations without hand-wrapping each one.

## 6. Creating PRs, Posting Comments, And Running Checks

These are the most common bot mutations.

### Post A Comment

```ts
const octokit = await installationClient(ctx.input.installationId);
const review = ctx.output(outputs.review, { nodeId: "review" });

const comment = await octokit.rest.issues.createComment({
  owner: ctx.input.owner,
  repo: ctx.input.repo,
  issue_number: ctx.input.pullNumber,
  body: review.commentBody,
});
```

### Create A Pull Request

```ts
await octokit.rest.pulls.create({
  owner,
  repo,
  title: "smithers: apply review fixes",
  head: "smithers/fix-branch",
  base: "main",
  body: "Automated fixes generated by Smithers.",
});
```

### Create And Update A Check Run

```ts
const check = await octokit.rest.checks.create({
  owner,
  repo,
  name: "smithers/review",
  head_sha: sha,
  status: "in_progress",
});

await octokit.rest.checks.update({
  owner,
  repo,
  check_run_id: check.data.id,
  status: "completed",
  conclusion: "success",
  output: {
    title: "Review complete",
    summary: "No blocking issues found.",
  },
});
```

Checks are a good place to surface machine-readable state while comments carry the longer narrative.
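
One way to connect the two is to derive the check's completion fields from the review output. The sketch below assumes the `review` schema from the earlier examples and assumes that a blocking review should fail the check; `checkCompletionFromReview` is a hypothetical helper.

```ts
type Review = { summary: string; commentBody: string; shouldBlock: boolean };

// Builds the completion fields for `octokit.rest.checks.update` from a review output.
function checkCompletionFromReview(review: Review) {
  return {
    status: "completed" as const,
    // Assumption: a blocking review fails the check; anything else passes.
    conclusion: review.shouldBlock ? ("failure" as const) : ("success" as const),
    output: {
      title: review.shouldBlock ? "Blocking issues found" : "Review complete",
      summary: review.summary,
    },
  };
}
```

The result can be spread into the `checks.update` call alongside `owner`, `repo`, and `check_run_id`.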

## 7. Example Workflow: Review A Pull Request

This example keeps the workflow simple:

- fetch PR data through [tools](/integrations/tools)
- ask an [agent](/concepts/agents-and-tools) to review it
- optionally publish a comment

```tsx
/** @jsxImportSource smithers-orchestrator */
import {
  Approval,
  Sequence,
  Task,
  Workflow,
  approvalDecisionSchema,
  createSmithers,
} from "smithers-orchestrator";
import { z } from "zod";

const { smithers, outputs } = createSmithers({
  review: z.object({
    summary: z.string(),
    commentBody: z.string(),
    shouldBlock: z.boolean(),
  }),
  approval: approvalDecisionSchema,
  publish: z.object({
    commentId: z.number().nullable(),
  }),
});

// `reviewer` and `installationClient` (see section 5) are defined elsewhere.
export default smithers((ctx) => {
  const approval = ctx.outputMaybe(outputs.approval, { nodeId: "approve-comment" });

  return (
    <Workflow name="github-pr-review">
      <Sequence>
        <Task id="review" output={outputs.review} agent={reviewer}>
          {`Review pull request #${ctx.input.pullNumber} in ${ctx.input.owner}/${ctx.input.repo}.

Use the available GitHub tools to inspect the diff and return:
- summary
- commentBody
- shouldBlock`}
        </Task>

        <Approval
          id="approve-comment"
          output={outputs.approval}
          request={{
            title: "Post review comment to GitHub?",
            summary: "Human can edit or deny before the bot writes back.",
          }}
          onDeny="continue"
        />

        {approval?.approved ? (
          <Task id="publish" output={outputs.publish}>
            {async () => {
              const review = ctx.output(outputs.review, { nodeId: "review" });
              const octokit = await installationClient(ctx.input.installationId);
              const result = await octokit.rest.issues.createComment({
                owner: ctx.input.owner,
                repo: ctx.input.repo,
                issue_number: ctx.input.pullNumber,
                body: review.commentBody,
              });

              return { commentId: result.data.id };
            }}
          </Task>
        ) : null}
      </Sequence>
    </Workflow>
  );
});
```

## 8. Long-Lived PR Workflows

The [gateway](/integrations/gateway) gets especially useful when your bot needs to pause and resume rather than fire one request and exit.

Typical pattern:

1. Start a run on `pull_request.opened`
2. Wait on [`<Signal>` or `<WaitForEvent>`](/runtime/events) for later comments, CI updates, or labels
3. Deliver those events with `signals.send`
4. Keep the run's context and outputs intact between events

That gives you a real conversation and state machine around the PR without writing one by hand.
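
The receiver needs some way to find the run that belongs to a PR before it can call `signals.send`. A minimal in-memory sketch of that lookup follows; `RunRegistry` is a hypothetical class, and a real bot would persist the mapping so it survives restarts.

```ts
// In-memory thread-to-run index (hypothetical design, not part of Smithers).
class RunRegistry {
  private readonly runs = new Map<string, string>();

  private key(owner: string, repo: string, pullNumber: number): string {
    return `${owner}/${repo}#${pullNumber}`;
  }

  // Call after `runs.create` returns a run ID.
  remember(owner: string, repo: string, pullNumber: number, runId: string): void {
    this.runs.set(this.key(owner, repo, pullNumber), runId);
  }

  // Decide whether a follow-up event should signal an existing run or start a new one.
  plan(
    owner: string,
    repo: string,
    pullNumber: number,
  ): { method: "signals.send"; runId: string } | { method: "runs.create" } {
    const runId = this.runs.get(this.key(owner, repo, pullNumber));
    return runId ? { method: "signals.send", runId } : { method: "runs.create" };
  }
}
```

Falling back to `runs.create` when no run is known keeps the bot usable for PRs that were opened before it was installed.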

## Next Steps

- [Gateway](/integrations/gateway)
- [Common External Tools](/integrations/common-tools)
- [Runtime Events](/runtime/events)
- [Approvals](/concepts/approvals)
- [Built-in Tools](/integrations/tools)

---

## PI Plugin Client

> A lightweight HTTP client for interacting with a Smithers server from any process, with functions for starting runs, streaming events, and managing approvals.
> Source: https://smithers.sh/integrations/pi-plugin

The PI plugin client is a lightweight HTTP client for a running Smithers server. Use it to start and resume runs, stream events, manage approvals, and query status.

## Import

```ts
import {
  runWorkflow,
  resume,
  approve,
  deny,
  streamEvents,
  getStatus,
  getFrames,
  cancel,
  listRuns,
} from "smithers-orchestrator/pi-plugin";
```

## Defaults

All functions accept optional `baseUrl` and `apiKey`:

| Parameter | Default |
|---|---|
| `baseUrl` | `http://127.0.0.1:7331` |
| `apiKey` | `undefined` (no auth header) |

If `apiKey` is provided, it is sent as `Authorization: Bearer <token>`.
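
The defaults can be pictured as a small resolution step that runs before each request. This is a sketch, not the client's source: `resolveRequest` is a hypothetical name, and the `content-type` header is an assumption.

```ts
const DEFAULT_BASE_URL = "http://127.0.0.1:7331";

// Hypothetical helper mirroring the documented defaults.
function resolveRequest(opts: { baseUrl?: string; apiKey?: string } = {}) {
  const headers: Record<string, string> = { "content-type": "application/json" };
  if (opts.apiKey) {
    // Only attach the auth header when a key was provided.
    headers["authorization"] = `Bearer ${opts.apiKey}`;
  }
  return { baseUrl: opts.baseUrl ?? DEFAULT_BASE_URL, headers };
}
```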

---

## Functions

### runWorkflow

Start a new workflow run.

```ts
async function runWorkflow(args: {
  workflowPath: string;
  input: unknown;
  runId?: string;
  baseUrl?: string;
  apiKey?: string;
}): Promise<{ runId: string }>
```

| Parameter | Type | Required | Description |
|---|---|---|---|
| `workflowPath` | `string` | Yes | Path to `.tsx` workflow file on the server |
| `input` | `unknown` | Yes | Workflow input data |
| `runId` | `string` | No | Custom run ID |
| `baseUrl` | `string` | No | Server URL |
| `apiKey` | `string` | No | Auth token |

```ts
const run = await runWorkflow({
  workflowPath: "./workflows/bugfix.tsx",
  input: { description: "Auth tokens expire silently" },
  apiKey: "sk-my-token",
});
console.log(run.runId);
```

---

### resume

Resume a paused or failed run.

```ts
async function resume(args: {
  workflowPath: string;
  runId: string;
  baseUrl?: string;
  apiKey?: string;
}): Promise<{ runId: string }>
```

| Parameter | Type | Required | Description |
|---|---|---|---|
| `workflowPath` | `string` | Yes | Path to `.tsx` workflow file |
| `runId` | `string` | Yes | Run ID to resume |
| `baseUrl` | `string` | No | Server URL |
| `apiKey` | `string` | No | Auth token |

Calls `POST /v1/runs` with `resume: true`.

---

### approve

Approve a node waiting for human approval.

```ts
async function approve(args: {
  runId: string;
  nodeId: string;
  iteration?: number;
  note?: string;
  baseUrl?: string;
  apiKey?: string;
}): Promise<{ runId: string }>
```

| Parameter | Type | Required | Description |
|---|---|---|---|
| `runId` | `string` | Yes | Run ID |
| `nodeId` | `string` | Yes | Node ID |
| `iteration` | `number` | No | Loop iteration (default: `0`) |
| `note` | `string` | No | Approval note |
| `baseUrl` | `string` | No | Server URL |
| `apiKey` | `string` | No | Auth token |

---

### deny

Deny a node waiting for human approval.

```ts
async function deny(args: {
  runId: string;
  nodeId: string;
  iteration?: number;
  note?: string;
  baseUrl?: string;
  apiKey?: string;
}): Promise<{ runId: string }>
```

| Parameter | Type | Required | Description |
|---|---|---|---|
| `runId` | `string` | Yes | Run ID |
| `nodeId` | `string` | Yes | Node ID |
| `iteration` | `number` | No | Loop iteration (default: `0`) |
| `note` | `string` | No | Denial reason |
| `baseUrl` | `string` | No | Server URL |
| `apiKey` | `string` | No | Auth token |

---

### streamEvents

Stream lifecycle events via SSE. Returns `AsyncIterable<SmithersEvent>`.

```ts
async function* streamEvents(args: {
  runId: string;
  baseUrl?: string;
  apiKey?: string;
}): AsyncIterable<SmithersEvent>
```

| Parameter | Type | Required | Description |
|---|---|---|---|
| `runId` | `string` | Yes | Run ID |
| `baseUrl` | `string` | No | Server URL |
| `apiKey` | `string` | No | Auth token |

```ts
for await (const event of streamEvents({ runId: "smi_abc123" })) {
  if (event.type === "RunFinished") break;
  if (event.type === "RunFailed") break;
  if (event.type === "NodeWaitingApproval") {
    console.log(`Node ${event.nodeId} needs approval.`);
  }
}
```

Connects to `GET /v1/runs/:runId/events`. Keep-alive comments are filtered. Completes when the stream closes.
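
For readers unfamiliar with SSE framing, the keep-alive filtering can be sketched as a small parser. This is an illustration, not the client's code: `parseSse` is a hypothetical name, it assumes each event's `data:` payload is JSON, and it parses a complete string rather than an incremental stream.

```ts
// Parses a complete SSE payload into JSON events.
// Lines starting with ":" are comments (used for keep-alives) and are skipped.
function parseSse(text: string): unknown[] {
  const events: unknown[] = [];
  // Events are separated by a blank line.
  for (const block of text.split(/\r?\n\r?\n/)) {
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith(":")) continue; // keep-alive comment
      if (line.startsWith("data:")) {
        // Drop the field name and one optional leading space.
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length > 0) {
      events.push(JSON.parse(dataLines.join("\n")));
    }
  }
  return events;
}
```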

---

### getStatus

Get run status and summary.

```ts
async function getStatus(args: {
  runId: string;
  baseUrl?: string;
  apiKey?: string;
}): Promise<{
  runId: string;
  workflowName: string;
  status: string;
  startedAtMs: number | null;
  finishedAtMs: number | null;
  summary: Record<string, number>;
}>
```

```ts
const status = await getStatus({ runId: "smi_abc123" });
console.log(status.status);  // "running" | "finished" | "failed" | ...
console.log(status.summary); // { finished: 3, pending: 2, "in-progress": 1 }
```
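
Since `summary` maps node states to counts, a progress figure is a small reduction over it. A minimal sketch, assuming the `finished` key shown in the example above; `completionRatio` is a hypothetical helper, not part of the client.

```ts
// Fraction of nodes in the "finished" state, from a status summary.
function completionRatio(summary: Record<string, number>): number {
  const total = Object.values(summary).reduce((sum, n) => sum + n, 0);
  return total === 0 ? 0 : (summary["finished"] ?? 0) / total;
}
```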

---

### getFrames

List render frames for a run.

```ts
async function getFrames(args: {
  runId: string;
  tail?: number;
  baseUrl?: string;
  apiKey?: string;
}): Promise<any[]>
```

| Parameter | Type | Required | Description |
|---|---|---|---|
| `runId` | `string` | Yes | Run ID |
| `tail` | `number` | No | Max frames (default: `20`) |
| `baseUrl` | `string` | No | Server URL |
| `apiKey` | `string` | No | Auth token |

---

### cancel

Cancel a running workflow.

```ts
async function cancel(args: {
  runId: string;
  baseUrl?: string;
  apiKey?: string;
}): Promise<{ runId: string }>
```

---

### listRuns

List runs, optionally filtered by status. Requires the server to have a `db` configured.

```ts
async function listRuns(args?: {
  limit?: number;
  status?: string;
  baseUrl?: string;
  apiKey?: string;
}): Promise<any[]>
```

| Parameter | Type | Required | Description |
|---|---|---|---|
| `limit` | `number` | No | Max runs (default: server default of 50) |
| `status` | `string` | No | Filter by status |
| `baseUrl` | `string` | No | Server URL |
| `apiKey` | `string` | No | Auth token |

---

## Complete Example

```ts
import {
  runWorkflow,
  streamEvents,
  approve,
  getStatus,
} from "smithers-orchestrator/pi-plugin";

const apiKey = process.env.SMITHERS_API_KEY;

const run = await runWorkflow({
  workflowPath: "./workflows/deploy.tsx",
  input: { branch: "main", environment: "staging" },
  apiKey,
});

for await (const event of streamEvents({ runId: run.runId, apiKey })) {
  switch (event.type) {
    case "NodeStarted":
      console.log(`Task ${event.nodeId} started (attempt ${event.attempt})`);
      break;
    case "NodeFinished":
      console.log(`Task ${event.nodeId} completed`);
      break;
    case "NodeWaitingApproval":
      await approve({
        runId: run.runId,
        nodeId: event.nodeId,
        note: "Auto-approved for staging",
        apiKey,
      });
      break;
    case "RunFinished":
      console.log("Deployment complete.");
      break;
    case "RunFailed":
      console.error("Deployment failed:", event.error);
      break;
  }
}

const status = await getStatus({ runId: run.runId, apiKey });
console.log(`Final status: ${status.status}`);
```

---

## PI CLI Extension

> The Smithers extension for the PI coding agent — MCP bridge, live dashboard, event ticker, approval flow, and slash commands that bring full workflow observability into PI's interactive session.
> Source: https://smithers.sh/integrations/pi-extension

The Smithers PI extension runs inside the PI coding agent and gives it complete Smithers knowledge and workflow observability. It does three things at once: it bridges the Smithers MCP server so the agent has live tool access, it injects the Smithers documentation into every system prompt so the LLM understands the full API, and it adds a set of slash commands and UI widgets that let you monitor and control runs without leaving the terminal.

## Installation

The extension ships with `smithers-orchestrator`. Install PI and add it to your `PATH`, then register the extension:

```bash
pi --extension smithers-orchestrator/pi-plugin/extension.ts
```

Or add it permanently to your PI config:

```json
{
  "extensions": ["smithers-orchestrator/pi-plugin/extension.ts"]
}
```

PI must be able to reach a running Smithers server. The default URL is `http://127.0.0.1:7331`. Start one with `smithers serve` before launching PI.

## Setup and configuration

The extension registers two CLI flags you can pass when launching PI:

| Flag | Short | Default | Description |
|---|---|---|---|
| `--smithers-url` | `-u` | `http://127.0.0.1:7331` | Smithers server base URL |
| `--smithers-key` | `-k` | — | API key for authenticated servers |

The API key is also read from `SMITHERS_API_KEY` in the environment. Setting the env var is preferred over passing the key on the command line.

```bash
SMITHERS_API_KEY=sk-my-token pi --extension smithers-orchestrator/pi-plugin/extension.ts
```

---

## MCP bridge

On session start, the extension spawns `smithers --mcp` as a child process and connects to it over stdio using the Model Context Protocol. It then discovers all available Smithers tools and registers them as PI agent tools named `smithers_<tool>`. This means the agent can call Smithers operations directly as tool calls, with full TypeBox parameter schemas derived from the MCP server's JSON Schema definitions.

The MCP connection is held open for the entire PI session and torn down cleanly on `session_shutdown`.

### Dynamic tool registration

Tools are registered dynamically at session start based on whatever the live MCP server exposes. If the server has a newer or older tool surface, the agent gets exactly that set — no hardcoded list. The `tui` tool is filtered out since the extension itself handles TUI interactions.

Tool call rendering in the PI chat history shows `smithers <tool-name> key=value …` for calls and a colored success/error indicator for results.

### MCP tool reference

| PI tool name | Underlying MCP tool | Description |
|---|---|---|
| `smithers_run` | `run` | Start a new workflow run |
| `smithers_status` | `status` | Get run status and node summary |
| `smithers_approve` | `approve` | Approve a node waiting for human sign-off |
| `smithers_deny` | `deny` | Deny a node waiting for human sign-off |
| `smithers_cancel` | `cancel` | Cancel a running workflow |
| `smithers_list` | `list` | List runs for a workflow from the database |
| `smithers_frames` | `frames` | List render frames (DAG snapshots) for a run |
| `smithers_graph` | `graph` | Preview the execution graph without running |
| `smithers_resume` | `resume` | Resume a paused or crashed run |
| `smithers_revert` | `revert` | Revert the workspace to a previous task attempt |

The agent is guided by the injected system prompt to prefer these tools over shelling out, and to follow the standard workflow pattern: discover with `smithers_list`, run with `smithers_run`, monitor with `smithers_status`, approve with `smithers_approve`, and debug with `smithers_frames` or `smithers_graph`.

---

## System prompt injection

On every `before_agent_start` event the extension augments the system prompt with two pieces of content.

### Full Smithers documentation

The extension loads `docs/llms-full.txt` (~125k tokens) from the package. It tries several candidate paths and falls back to `llms.txt` if the full file is not found. This gives the LLM complete knowledge of the Smithers API, component reference, and usage patterns.

Resolution order for `llms-full.txt`:

1. `<package-dir>/docs/llms-full.txt`
2. `<package-dir>/../docs/llms-full.txt`
3. `<cwd>/docs/llms-full.txt`
4. `<cwd>/node_modules/smithers-orchestrator/docs/llms-full.txt`

Then the same four paths for the `llms.txt` fallback.
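
The resolution order amounts to eight candidate paths, tried in sequence. The sketch below builds that list; `docCandidates` is a hypothetical name, and how the extension determines `<package-dir>` is not shown here.

```ts
import { join } from "node:path";

// Builds the lookup order: four locations for llms-full.txt,
// then the same four locations for the llms.txt fallback.
function docCandidates(packageDir: string, cwd: string): string[] {
  const roots = [
    join(packageDir, "docs"),
    join(packageDir, "..", "docs"),
    join(cwd, "docs"),
    join(cwd, "node_modules", "smithers-orchestrator", "docs"),
  ];
  const candidates: string[] = [];
  for (const file of ["llms-full.txt", "llms.txt"]) {
    for (const root of roots) {
      candidates.push(join(root, file));
    }
  }
  return candidates;
}
```

The first path that exists wins, for example `docCandidates(pkgDir, process.cwd()).find(existsSync)` with `existsSync` from `node:fs`.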

### Active run context

If the extension is tracking an active run, it appends a short context block:

```
## Active Run Context
Run: smi_abc123 (deploy.tsx)
Status: waiting-approval
Nodes waiting approval: checkDeploy
```

This tells the LLM exactly what is happening right now without it having to call a tool first.

---

## UI components

The following components are active whenever PI is in interactive (TUI) mode.

### Header

Displays `smithers · workflow orchestrator` branding at the top of every PI session. Rendered via `ctx.ui.setHeader`.

### Status bar

Shows a live count of active runs, pending approvals, completed runs, and failed runs in the form `smithers: 2 active · 1 awaiting approval · 3 done`. Updates on every background poll cycle and after every event stream message. Disappears when no runs have been tracked.

A separate approval indicator status entry (`smithers-approval`) appears at the start of each conversation turn when one or more nodes are waiting for approval: `⏳ N node(s) awaiting approval`.

### Event ticker

When a run is being watched via the event stream, the 5 most recent events are rendered as a widget above the editor input. Each line shows a timestamp and the event message. Updates in real time as events arrive.

```
  14:32:01  Run started
  14:32:04  analyzeCode started (attempt 1)
  14:32:11  analyzeCode → bash()
  14:32:14  analyzeCode → bash() ✓
  14:32:18  analyzeCode finished
```
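
The ticker behaves like a bounded buffer that keeps only the newest entries. A minimal sketch of that behaviour follows; `TickerEntry`, `pushTickerEntry`, and `renderTicker` are hypothetical names, and the two-space layout is inferred from the sample above.

```ts
type TickerEntry = { time: string; message: string };

// Returns a new buffer holding at most `max` of the most recent entries.
function pushTickerEntry(
  buffer: readonly TickerEntry[],
  entry: TickerEntry,
  max = 5,
): TickerEntry[] {
  if (max <= 0) return [];
  return [...buffer, entry].slice(-max);
}

// Renders entries as indented "time  message" lines.
function renderTicker(buffer: readonly TickerEntry[]): string {
  return buffer.map((e) => `  ${e.time}  ${e.message}`).join("\n");
}
```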

### Message renderer

A custom PI message renderer (`smithers-event`) formats workflow event messages in the chat history. It uses status color coding and status icons, and shows the run ID when the message is expanded.

### Auto-polling

Active runs are polled every 10 seconds for status updates even when no event stream is attached. Each poll also fetches Prometheus metrics from the server's `/metrics` endpoint. Polling starts on `session_start` and stops on `session_shutdown`.
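
The polling lifecycle can be sketched as a start/stop pair around an interval timer. `createPoller` and `TimerApi` are hypothetical; the timer functions are injected so the sketch can be exercised without waiting 10 seconds.

```ts
// Hypothetical timer abstraction, injected to keep the sketch testable.
interface TimerApi {
  setInterval(fn: () => void, ms: number): unknown;
  clearInterval(handle: unknown): void;
}

const POLL_INTERVAL_MS = 10000;

function createPoller(poll: () => void, timers: TimerApi) {
  let handle: unknown;
  let running = false;
  return {
    // Intended to be called from the `session_start` handler.
    start(): void {
      if (running) return; // starting twice must not create a second timer
      running = true;
      handle = timers.setInterval(poll, POLL_INTERVAL_MS);
    },
    // Intended to be called from the `session_shutdown` handler.
    stop(): void {
      if (!running) return;
      timers.clearInterval(handle);
      running = false;
    },
    isRunning(): boolean {
      return running;
    },
  };
}
```

In a real session the `timers` argument would wrap the global `setInterval` and `clearInterval`, and `poll` would fetch run status and metrics.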

---

## Slash commands

Apart from the bare `/smithers` dashboard command, every command is named `/smithers-<name>`. All are available inside the PI interactive session.

| Command | Argument | Description |
|---|---|---|
| `/smithers` | — | Open the full-screen dashboard overlay |
| `/smithers-runs` | — | List all tracked runs; select one to make it active |
| `/smithers-watch` | `[runId]` | Attach a live SSE event stream to a run |
| `/smithers-run` | `[workflow]` | Start a workflow and auto-attach the event stream |
| `/smithers-resume` | `[workflow]` | Resume a paused or crashed run |
| `/smithers-approve` | — | Interactive approve/deny flow for waiting nodes |
| `/smithers-cancel` | `[runId]` | Cancel a running workflow with confirmation |
| `/smithers-status` | `[runId]` | Show detailed status; defaults to active run |
| `/smithers-logs` | `[nodeId]` | Scrollable log viewer for a node's output |
| `/smithers-frames` | `[runId]` | Browse render frames (DAG snapshots) for a run |
| `/smithers-graph` | `[workflow]` | Preview the execution graph without running |
| `/smithers-revert` | `[workflow]` | Revert workspace to a previous task attempt |
| `/smithers-list` | `[workflow]` | List runs from the database for a workflow |
| `/smithers-metrics` | — | Live Prometheus metrics overlay |

Arguments in brackets are optional. When an argument is omitted, the command prompts interactively or falls back to the current active run.

Run ID arguments support tab-completion from the set of tracked runs.

### /smithers — Dashboard overlay

Full-screen TUI overlay with four tabs:

- **1 Overview** — all tracked runs with status icon, workflow name, run ID prefix, elapsed time, and a node state summary
- **2 Nodes** — per-node breakdown for the selected run with state, duration, and the last two lines of captured output
- **3 Events** — the 20 most recent events for the selected run, color-coded by type
- **4 Errors** — all errors collected for the selected run

Navigate with `j`/`k` to select runs, `1`–`4` to switch tabs, `q` or `Esc` to close.
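
The key bindings can be modelled as a pure state transition. This is a sketch, not the overlay's implementation: the state shape, the `"Escape"` key name, and the direction of `j` (down) and `k` (up) are assumptions.

```ts
type DashboardTab = "overview" | "nodes" | "events" | "errors";

// Hypothetical overlay state.
interface DashboardState {
  tab: DashboardTab;
  selectedRun: number;
  isOpen: boolean;
}

function handleDashboardKey(state: DashboardState, key: string, runCount: number): DashboardState {
  const lastIndex = Math.max(runCount - 1, 0);
  switch (key) {
    case "q":
    case "Escape":
      return { ...state, isOpen: false };
    case "j": // assumed: move selection down, clamped to the last run
      return { ...state, selectedRun: Math.min(state.selectedRun + 1, lastIndex) };
    case "k": // assumed: move selection up, clamped to the first run
      return { ...state, selectedRun: Math.max(state.selectedRun - 1, 0) };
    case "1":
      return { ...state, tab: "overview" };
    case "2":
      return { ...state, tab: "nodes" };
    case "3":
      return { ...state, tab: "events" };
    case "4":
      return { ...state, tab: "errors" };
    default:
      return state; // unknown keys leave the state untouched
  }
}
```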

### /smithers-runs

Shows a selection list of all tracked runs. Selecting a run makes it the active run for the event ticker and status bar.

### /smithers-watch

Attaches a live SSE stream to a run. Events update the ticker widget and the dashboard in real time. Accepts a run ID as an argument or prompts if omitted.

### /smithers-run

Prompts for a workflow path and optional input JSON, calls `smithers_run`, and automatically attaches the event stream to the new run.

### /smithers-resume

Prompts for a workflow path and run ID, calls `smithers_resume`, and auto-attaches the event stream to the resumed run.

### /smithers-approve

Collects all nodes in the `waiting-approval` state across all tracked runs and presents a selection list. After choosing a node the user picks Approve or Deny and can add an optional note. Posts directly to the Smithers server API.
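
Collecting the candidates is a flat scan over tracked runs. A sketch with hypothetical types (`TrackedRun`, `TrackedNode`, `PendingApproval`); only the `waiting-approval` state name comes from the text above.

```ts
// Hypothetical tracked-state shapes.
interface TrackedNode {
  nodeId: string;
  state: string;
}

interface TrackedRun {
  runId: string;
  nodes: TrackedNode[];
}

interface PendingApproval {
  runId: string;
  nodeId: string;
}

// Gathers every node waiting for approval, across all tracked runs.
function collectPendingApprovals(runs: TrackedRun[]): PendingApproval[] {
  const pending: PendingApproval[] = [];
  for (const run of runs) {
    for (const node of run.nodes) {
      if (node.state === "waiting-approval") {
        pending.push({ runId: run.runId, nodeId: node.nodeId });
      }
    }
  }
  return pending;
}
```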

### /smithers-cancel

Presents a selection list of active runs or accepts a run ID argument. Requires confirmation before calling `POST /v1/runs/:runId/cancel`.

### /smithers-status

Calls `smithers_status` and pastes the result into the editor. Defaults to the active run when no ID is provided.

### /smithers-logs

Opens a scrollable log viewer for a specific node's captured output. Prompts for node selection if no argument is given. Keys: `j`/`k` to scroll, `g`/`G` for top/bottom, `q`/`Esc` to close.

### /smithers-frames

Opens a split view: a frame list on top and frame detail below. The detail pane shows the task index, mounted task IDs, any note, and an XML snippet of the DAG snapshot. Navigate with `j`/`k`, close with `q`/`Esc`.

### /smithers-graph

Calls `smithers_graph` and pastes the JSON execution graph into the editor. Accepts a workflow path as an argument or prompts if omitted.

### /smithers-revert

Interactive revert flow: prompts for workflow path, run ID, node ID, and attempt number, then requires confirmation before calling `smithers_revert`. Warns that this modifies the working directory.

### /smithers-list

Calls `smithers_list` for the given workflow and pastes the JSON run list into the editor.

### /smithers-metrics

Full-screen TUI overlay showing live Prometheus metrics fetched from `GET /metrics`:

**Counters and gauges** — runs started/finished/failed/cancelled/resumed, active runs and nodes, node retries, token counts (input/output/cache read/cache write/reasoning), tool calls and errors, cache hits/misses, approvals requested/granted/denied/pending, hot reloads, HTTP requests, DB retries, scheduler queue depth, events emitted, process uptime and memory.

**Histograms (p50 / p99 / avg / count)** — run duration, node duration, attempt duration, tool duration, tokens per call, prompt and response size, approval wait time, scheduler wait time, DB query latency, HTTP request latency, hot reload duration, VCS duration.

Press `r` to toggle raw Prometheus text output. Close with `q` or `Esc`.
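
To turn the raw text behind the `r` toggle into numbers, a small parser for the Prometheus text exposition format is enough. The sketch below sums sample values per metric name across label sets, which is meaningful for counters but not for histogram buckets. `parsePrometheusText` is a hypothetical helper, and the metric names in any sample input are made up.

```ts
// Sums sample values per metric name; comment lines (# HELP, # TYPE) are skipped.
function parsePrometheusText(text: string): Record<string, number> {
  const totals: Record<string, number> = Object.create(null);
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    // Metric name, optional {labels}, then the value (a trailing timestamp is ignored).
    const match = /^([A-Za-z_:][A-Za-z0-9_:]*)(\{.*\})?\s+(\S+)/.exec(line);
    if (!match) continue;
    const value = Number(match[3]);
    if (isNaN(value)) continue; // also drops NaN and +Inf samples
    const metric = match[1];
    totals[metric] = (totals[metric] || 0) + value;
  }
  return totals;
}
```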

---

## JSON Schema to TypeBox conversion

When the extension registers MCP tools as PI tools, it converts the MCP server's JSON Schema `inputSchema` to TypeBox schemas on the fly. String, number/integer, boolean, and array properties are mapped to their TypeBox equivalents. Required properties become mandatory fields; optional properties are wrapped in `Type.Optional`. This allows PI's parameter validation and UI to work without any hardcoded schema definitions.
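
The mapping can be sketched against a small builder interface whose method names mirror TypeBox's `Type` (`Type.String`, `Type.Integer`, `Type.Optional`, and so on). This is not the extension's converter: `SchemaBuilder`, `convertProp`, and `convertInputSchema` are hypothetical, and falling back to `Unknown` for unlisted types is an assumption.

```ts
// Minimal stand-in for the parts of a TypeBox-style builder used here.
interface SchemaBuilder<S> {
  String(): S;
  Number(): S;
  Integer(): S;
  Boolean(): S;
  Array(items: S): S;
  Object(props: Record<string, S>): S;
  Optional(schema: S): S;
  Unknown(): S;
}

interface JsonSchemaProp {
  type?: string;
  items?: JsonSchemaProp;
}

interface JsonObjectSchema {
  properties?: Record<string, JsonSchemaProp>;
  required?: string[];
}

function convertProp<S>(t: SchemaBuilder<S>, prop: JsonSchemaProp): S {
  switch (prop.type) {
    case "string": return t.String();
    case "number": return t.Number();
    case "integer": return t.Integer();
    case "boolean": return t.Boolean();
    case "array": return t.Array(prop.items ? convertProp(t, prop.items) : t.Unknown());
    default: return t.Unknown(); // assumption: other types pass through unvalidated
  }
}

function convertInputSchema<S>(t: SchemaBuilder<S>, schema: JsonObjectSchema): S {
  const required = schema.required || [];
  const source = schema.properties || {};
  const props: Record<string, S> = {};
  for (const key of Object.keys(source)) {
    const prop = source[key];
    if (!prop) continue;
    const converted = convertProp(t, prop);
    // Required properties stay mandatory; everything else is wrapped as optional.
    props[key] = required.indexOf(key) >= 0 ? converted : t.Optional(converted);
  }
  return t.Object(props);
}
```

Keeping the builder behind an interface lets the conversion be exercised with a stub; in practice a thin adapter over the real `Type` export would be passed in.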

---

## Relationship to the PI plugin client

This extension and the `smithers-orchestrator/pi-plugin` HTTP client are complementary, not overlapping. The extension is loaded by the PI CLI itself and adds agent tools, UI, and slash commands. The HTTP client (`pi-plugin`) is a lightweight module you import in your own code — a PI extension or any Node process — to drive the Smithers server API from TypeScript. Both are part of the same package.

See the [PI Plugin Client](/integrations/pi-plugin) page for the HTTP client API reference.

---

## PI Integration

> Use PI as a Smithers workflow CLI backend and understand how PI extensibility composes with Smithers declarative orchestration.
> Source: https://smithers.sh/integrations/pi-integration

Smithers provides deterministic orchestration (workflow graph, approvals, retries, durable state). PI provides adaptive agent capabilities (providers, models, extensions, skills, prompt templates). Use both when you need deterministic execution with flexible agent behavior.

## Integration Modes

### 1) PI as Workflow Agent

```tsx
import { PiAgent, Task } from "smithers-orchestrator";

const pi = new PiAgent({
  provider: "openai",
  model: "gpt-5.2-codex",
  mode: "text",
});

// `outputs` comes from createSmithers()
<Task id="implementation" output={outputs.implementation} agent={pi}>
  {`Implement feature X and explain tradeoffs.`}
</Task>
```

`PiAgent` supports all PI CLI flags: provider/model, tools, extensions, skills, prompt templates, themes, and sessions. Text mode uses `--print` by default; JSON/RPC modes set `--mode` and omit `--print`.
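
The mode-to-flag rule in the previous paragraph can be written down as a tiny function. This is a sketch of the rule as described, not `PiAgent`'s source: `piModeArgs` is hypothetical, and the literal `rpc` value is an assumption about how the RPC mode is spelled.

```ts
type PiMode = "text" | "json" | "rpc";

// Text mode prints directly; the other modes select an explicit --mode and omit --print.
function piModeArgs(mode: PiMode): string[] {
  return mode === "text" ? ["--print"] : ["--mode", mode];
}
```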

PI sessions are first-class hijack targets. `smithers hijack <runId> --target pi` reopens the PI session for local steering.

### 2) PI Server Client

Use `pi-plugin` to drive Smithers server APIs from a PI extension or any Node process:

```ts
import { runWorkflow, approve, streamEvents } from "smithers-orchestrator/pi-plugin";
```

### 3) Hybrid: PI Extensibility + Smithers Orchestration

- Keep orchestration in Smithers (`<Sequence>`, `<Parallel>`, `<Branch>`, `<Loop>`).
- Run adaptive logic in PI tasks (extensions/skills/provider overrides).

Patterns:

1. PI skill-driven coding task inside a Smithers `<Task>`.
2. PI extension command that starts/resumes Smithers workflows via server API or pi-plugin.
3. Smithers workflow output persisted to SQLite and consumed by later PI-assisted tasks.

## Hijacking PI Sessions

PI is a native-session hijack backend.

- Live run: Smithers watches PI's event stream, waits between blocking tool calls, then hands off the session.
- Finished/cancelled run: Smithers reopens the latest persisted PI session.
- Relaunch uses the stored session ID: `pi --session <id>`.
- Clean exit resumes the workflow automatically.

Session persistence:

- `PiAgent` defaults `noSession` to `true` for one-shot calls.
- For workflow hijack/resume/streaming, Smithers keeps session persistence enabled automatically.
- No need to set `mode: "json"` manually for hijack support.

## Setup

1. Install the PI CLI and add it to `PATH`.
2. Configure PI credentials via environment variables or config files; prefer these over CLI arguments for API keys.
3. Instantiate `PiAgent` with explicit options in workflows.
4. For server-driven workflows, use `pi-plugin`.

```bash
pi --version
bun run test
```

## Design Guidance

| Use `PiAgent` tasks when | Use Smithers-native tasks when |
|---|---|
| You need PI capabilities inside deterministic workflows | You need strict reproducibility and narrow tool contracts |
| You want PI calls as auditable workflow steps | |

## Limitations

Chat-provider integration lives in host applications, not this repo.

---

## Ecosystem

> Community projects built on Smithers.
> Source: https://smithers.sh/integrations/ecosystem

Third-party tools and workflows built by the community.

## Burns

Workspace-first local control plane for Smithers. Single UI for authoring, running, and supervising workflows across repositories. Register repos, launch runs, stream events, inspect frames, handle approvals.

- React web app, ElectroBun desktop shell, or headless CLI
- AI-assisted workflow authoring via local agent CLIs
- SQLite-backed workspace registry

<Card title="Burns" icon="github" href="https://github.com/l3wi/burns">
  github.com/l3wi/burns
</Card>

## Ralphinho

Multi-agent development workflows. Two independent workflows:

- **Ralphinho** (scheduled-work) -- decomposes RFC into work units, runs tier-based quality pipelines (implement, test, review), lands via merge queue with CI verification.
- **Improvinho** (review-discovery) -- three parallel discovery lenses (refactoring, type safety, architecture), deduplicates findings. Optionally pushes to Linear.

Requires Bun and Jujutsu (`jj`). Supports Claude and Codex agents.

<Card title="Ralphinho" icon="github" href="https://github.com/enitrat/ralphinho">
  github.com/enitrat/ralphinho
</Card>

## Cairo Coder

AI-powered Cairo smart contract generator. RAG pipeline (DSPy) converting natural language to Cairo contracts for Starknet. Uses Smithers with Claude and Codex agents.

<Card title="Cairo Coder" icon="github" href="https://github.com/KasarLabs/cairo-coder">
  github.com/KasarLabs/cairo-coder
</Card>

## Agentix

Opinionated RFC-to-production orchestrator. Multi-phase pipelines: research, plan, implement, test, review. Role-based agents, conflict-aware merge queues, security/performance gates. DDD + BDD + TDD by default.

<Card title="Agentix" icon="github" href="https://github.com/AbdelStark/agentix">
  github.com/AbdelStark/agentix
</Card>

## Era

Generic multi-phase development workflow engine. Research, Plan, Implementation, Testing, Review, Fix, Final Review pipeline with outer Loop. Role-based agents, intelligent caching, dual-layer prompts.

<Card title="Era" icon="github" href="https://github.com/ClementWalter/era">
  github.com/ClementWalter/era
</Card>

## Local Isolated Ralph

Kubernetes-native Smithers workflow runner. Runs workflows as isolated K8s Jobs and CronJobs via k3s/k3d. Sandboxed container execution.

<Card title="Local Isolated Ralph" icon="github" href="https://github.com/SamuelLHuber/local-isolated-ralph">
  github.com/SamuelLHuber/local-isolated-ralph
</Card>

---

## runWorkflow

> Execute a Smithers workflow programmatically and get back a durable RunResult.
> Source: https://smithers.sh/runtime/run-workflow

```tsx
import { createSmithers, Task, runWorkflow } from "smithers-orchestrator";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  analysis: z.object({ summary: z.string() }),
});

// `myAgent` is any agent instance defined elsewhere (for example, a `PiAgent`).
const workflow = smithers((ctx) => (
  <Workflow name="example">
    <Task id="analyze" output={outputs.analysis} agent={myAgent}>
      {`Analyze: ${ctx.input.description}`}
    </Task>
  </Workflow>
));

const result = await runWorkflow(workflow, {
  input: { description: "Auth tokens expire silently" },
});

console.log(result.status); // "finished" | "failed" | "cancelled" | "continued" | "waiting-approval" | ...
```

## Signature

```ts
function runWorkflow<Schema>(
  workflow: SmithersWorkflow<Schema>,
  opts: RunOptions,
): Promise<RunResult>;
```

## RunOptions

| Field | Type | Default | Description |
|---|---|---|---|
| `input` | `Record<string, unknown>` | **(required)** | Input data for the run. Must be JSON-serializable. `runId` is injected automatically. |
| `runId` | `string` | Auto-generated | Deterministic run ID. |
| `resume` | `boolean` | `false` | Resume an existing run. Requires `runId`. Skips completed tasks. |
| `maxConcurrency` | `number` | `4` | Max parallel tasks. Also respects per-group `<Parallel maxConcurrency>`. |
| `onProgress` | `(e: SmithersEvent) => void` | `undefined` | Callback for every lifecycle event. See [Events](/runtime/events). |
| `signal` | `AbortSignal` | `undefined` | Cancel the run. Finishes with status `"cancelled"`. |
| `workflowPath` | `string` | `undefined` | Path to the workflow `.tsx` file. Resolves default `rootDir`. |
| `rootDir` | `string` | Workflow file's directory | Sandbox root for file-system tools (`read`, `edit`, `write`, `bash`, `grep`). |
| `logDir` | `string \| null` | `.smithers/executions/<runId>/logs` | NDJSON event log directory. `null` disables logging. Relative paths resolve from `rootDir`. |
| `allowNetwork` | `boolean` | `false` | Permit network requests from `bash`. |
| `maxOutputBytes` | `number` | `200000` | Max bytes per tool call output. Truncated beyond this. |
| `toolTimeoutMs` | `number` | `60000` | Wall-clock timeout (ms) per tool call. |
| `hot` | `boolean \| HotReloadOptions` | `undefined` | Enable hot-reload. `true` for defaults, or pass `HotReloadOptions`. |
| `cliAgentToolsDefault` | `"all" \| "explicit-only"` | `"all"` | Default tool access policy for CLI-backed agents. When `"explicit-only"`, agents can only use tools listed in the task's `allowTools` prop. `"explicit-only"` is recommended for production workflows. |
| `parentRunId` | `string` | `undefined` | Parent run ID for child workflow / subflow ancestry tracking. |
| `force` | `boolean` | `false` | Allow resume even when the run's owner process is still alive (overrides PID liveness check). |
| `auth` | `RunAuthContext` | `undefined` | Authentication context accessible as `ctx.auth` in the workflow. |
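
The documented defaults can be summarised in code. The sketch below uses a local `RunOptionsSketch` type covering a subset of the fields rather than the package's `RunOptions`, and `applyRunDefaults` is a hypothetical helper, not how the engine resolves options.

```ts
// Local illustration type; a subset of the fields in the table above.
interface RunOptionsSketch {
  input: Record<string, unknown>;
  runId?: string;
  resume?: boolean;
  maxConcurrency?: number;
  logDir?: string | null;
  allowNetwork?: boolean;
  maxOutputBytes?: number;
  toolTimeoutMs?: number;
  cliAgentToolsDefault?: "all" | "explicit-only";
}

function applyRunDefaults(opts: RunOptionsSketch, generatedRunId: string) {
  if (opts.resume && !opts.runId) {
    throw new Error("resume requires runId");
  }
  const runId = opts.runId ?? generatedRunId;
  return {
    input: opts.input,
    runId,
    resume: opts.resume ?? false,
    maxConcurrency: opts.maxConcurrency ?? 4,
    // `null` is meaningful (logging disabled), so only `undefined` gets the default.
    logDir: opts.logDir !== undefined ? opts.logDir : `.smithers/executions/${runId}/logs`,
    allowNetwork: opts.allowNetwork ?? false,
    maxOutputBytes: opts.maxOutputBytes ?? 200000,
    toolTimeoutMs: opts.toolTimeoutMs ?? 60000,
    cliAgentToolsDefault: opts.cliAgentToolsDefault ?? "all",
  };
}
```

Note the `logDir` handling: using `??` there would silently re-enable logging when the caller passed `null`.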

### HotReloadOptions

| Field | Type | Default | Description |
|---|---|---|---|
| `rootDir` | `string` | Auto-detect | Directory to watch for file changes. |
| `outDir` | `string` | `.smithers/hmr/<runId>` | Directory for generation overlays. |
| `maxGenerations` | `number` | `3` | Max overlay generations to keep. |
| `cancelUnmounted` | `boolean` | `false` | Cancel tasks unmounted after hot reload. |
| `debounceMs` | `number` | `100` | Debounce interval (ms) for file changes. |

## RunResult

```ts
type RunResult = {
  runId: string;
  status: "finished" | "failed" | "cancelled" | "continued" | "waiting-approval" | "waiting-event" | "waiting-timer";
  output?: unknown;
  error?: unknown;
};
```

| Field | Type | Description |
|---|---|---|
| `runId` | `string` | Run identifier (provided or auto-generated). |
| `status` | `string` | Status when `runWorkflow` returns: a terminal state or one of the `waiting-*` states. See Status Values below. |
| `output` | `unknown` | Output rows, if the schema includes a key named `output`. See below. |
| `error` | `unknown` | Serialized error with `code`, `message`, and optional `details`. |

### `result.output`

`output` is populated only when the schema passed to `createSmithers()` has a key literally named `output`:

```ts
// result.output WILL be populated
const { Workflow, smithers, outputs } = createSmithers({
  output: z.object({ summary: z.string() }),
});
```

```ts
// result.output will be undefined
const { Workflow, smithers, outputs } = createSmithers({
  page: z.object({ title: z.string(), html: z.string() }),
});
```

Other schema keys (`page`, `analysis`, etc.) are persisted to SQLite but not returned on `result.output`. Query them directly:

```ts
import { Database } from "bun:sqlite";

const result = await runWorkflow(workflow, { input: {} });
const db = new Database("smithers.db", { readonly: true });
const rows = db.query(
  "SELECT * FROM page WHERE run_id = ? ORDER BY iteration DESC"
).all(result.runId);
db.close();
```

### Status Values

| Status | Meaning |
|---|---|
| `"finished"` | All tasks completed. |
| `"failed"` | A task failed after exhausting retries; `continueOnFail` not set. |
| `"cancelled"` | Cancelled via `AbortSignal` or hijack handoff. |
| `"continued"` | Run ended via `<ContinueAsNew>` -- a fresh run has started with carried state. |
| `"waiting-approval"` | A task requires human approval. Unblock with `smithers approve` or `smithers deny`. |
| `"waiting-event"` | A task is waiting for an external signal or event. Send via `signalRun()` or the CLI. |
| `"waiting-timer"` | A task is suspended until a durable timer fires. |
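
Because `status` is a closed union, callers can branch on it exhaustively. A sketch using a local copy of the union; `isWaiting` and `nextStep` are hypothetical helpers, and the suggested follow-ups paraphrase the table above.

```ts
type RunResultStatus =
  | "finished"
  | "failed"
  | "cancelled"
  | "continued"
  | "waiting-approval"
  | "waiting-event"
  | "waiting-timer";

// True when the run is paused and can continue later; false when it has ended.
function isWaiting(s: RunResultStatus): boolean {
  return s === "waiting-approval" || s === "waiting-event" || s === "waiting-timer";
}

// The switch is exhaustive: adding a status to the union makes this fail to type-check.
function nextStep(s: RunResultStatus): string {
  switch (s) {
    case "finished": return "read result.output or query SQLite";
    case "failed": return "inspect result.error";
    case "cancelled": return "resume later if the work should continue";
    case "continued": return "follow the new run";
    case "waiting-approval": return "approve or deny the waiting task";
    case "waiting-event": return "send the signal or event";
    case "waiting-timer": return "wait for the timer to fire";
  }
}
```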

## Resuming a Run

Pass `resume: true` with the original `runId`. Smithers reads persisted state from SQLite, skips completed tasks, and continues from the first pending task.

```ts
const result = await runWorkflow(workflow, {
  input: {},
  runId: "my-run-123",
  resume: true,
});
```

- The original input row is loaded from the database; pass an empty object for `input`.
- Workflow path, file hash, and VCS metadata must match the current environment.
- In-progress attempts older than 15 minutes are automatically cancelled and retried.
- Tasks with valid persisted outputs are skipped.
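
The per-task rules above can be read as a small decision function. This is an illustration, not the engine's logic: the `PersistedTaskState` shape is hypothetical, attempt age is assumed to be measured from its start time, and `"leave-running"` for younger attempts is a guess the text does not confirm.

```ts
const STALE_ATTEMPT_MS = 15 * 60 * 1000;

// Hypothetical view of what is persisted for a task.
type PersistedTaskState =
  | { kind: "finished"; hasValidOutput: boolean }
  | { kind: "in-progress"; startedAtMs: number }
  | { kind: "pending" };

type ResumeAction = "skip" | "cancel-and-retry" | "leave-running" | "run";

function planResume(state: PersistedTaskState, nowMs: number): ResumeAction {
  switch (state.kind) {
    case "finished":
      // Only tasks with a valid persisted output are skipped.
      return state.hasValidOutput ? "skip" : "run";
    case "in-progress":
      // Attempts older than 15 minutes are treated as abandoned.
      return nowMs - state.startedAtMs > STALE_ATTEMPT_MS ? "cancel-and-retry" : "leave-running";
    case "pending":
      return "run";
  }
}
```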

## Hijacking and Resuming Agent State

Smithers persists agent continuation state:

- CLI-backed agents persist a native session ID (Claude, Codex, Gemini, PI, Kimi, Forge, or Amp).
- SDK-style agents persist conversation `messages`.

When a run is hijacked (CLI or TUI):

- `RunHijackRequested` and `RunHijacked` events are emitted.
- The run ends with status `"cancelled"`.
- The latest attempt metadata stores `hijackHandoff` plus `agentResume` or `agentConversation`.

On `resume: true`, Smithers reuses that persisted state instead of starting fresh. Before handing off a hijacked run, Smithers waits for a safe point: between blocking tool calls for CLI agents, or once the message history is durably persisted for conversation-backed agents.

## Cancellation

```ts
const controller = new AbortController();

// Cancel after 5 minutes
setTimeout(() => controller.abort(), 5 * 60 * 1000);

const result = await runWorkflow(workflow, {
  input: { description: "Long task" },
  signal: controller.signal,
});

if (result.status === "cancelled") {
  console.log("Run was cancelled");
}
```

All in-progress attempts are marked cancelled in the database and `NodeCancelled` events are emitted.

## Event Monitoring

```ts
const result = await runWorkflow(workflow, {
  input: { description: "Fix bug" },
  onProgress: (event) => {
    switch (event.type) {
      case "NodeStarted":
        console.log(`Task ${event.nodeId} started (attempt ${event.attempt})`);
        break;
      case "NodeFinished":
        console.log(`Task ${event.nodeId} finished`);
        break;
      case "NodeFailed":
        console.error(`Task ${event.nodeId} failed:`, event.error);
        break;
      case "ApprovalRequested":
        console.log(`Task ${event.nodeId} needs approval`);
        break;
    }
  },
});
```

See [Events](/runtime/events) for the full event type list.

## Idle Sleep Prevention

On macOS, `runWorkflow` acquires a `caffeinate` lock to prevent idle sleep and releases it when the run completes. On other platforms this is a no-op.

## Error Handling

Unhandled engine exceptions mark the run `"failed"` and serialize into `RunResult.error`. Task-level failures are handled by retry and `continueOnFail` mechanisms.

Set `SMITHERS_DEBUG=1` to print engine errors to stderr.
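
Since `RunResult.error` is typed `unknown`, a type guard helps before reading its fields. A minimal sketch: `SerializedRunError` is a local type based on the `code`, `message`, and optional `details` fields mentioned earlier, and treating `code` as a string is an assumption.

```ts
// Local type; `code` being a string is an assumption.
interface SerializedRunError {
  code: string;
  message: string;
  details?: unknown;
}

function isSerializedRunError(value: unknown): value is SerializedRunError {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as { code?: unknown; message?: unknown };
  return typeof candidate.code === "string" && typeof candidate.message === "string";
}

// Produces a one-line summary, falling back to String() for unexpected shapes.
function summarizeRunError(error: unknown): string {
  return isSerializedRunError(error) ? `${error.code}: ${error.message}` : String(error);
}
```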

## Related

- [Events](/runtime/events) -- All event types emitted during a run.
- [renderFrame](/runtime/render-frame) -- Preview the workflow graph without executing.
- [CLI](/cli/overview) -- Run workflows from the command line.
- [Resumability](/guides/resumability) -- How durable state and crash recovery work.

---

## renderFrame

> Render a workflow tree to a GraphSnapshot for visualization and debugging without executing any tasks.
> Source: https://smithers.sh/runtime/render-frame

`renderFrame` converts a workflow's JSX tree into a `GraphSnapshot` -- an XML representation and an ordered task list. It does not execute tasks, call agents, or modify the database.

## Usage

```ts
import { renderFrame } from "smithers-orchestrator";
import workflow from "./workflow";

const snapshot = await renderFrame(workflow, {
  runId: "preview",
  iteration: 0,
  input: { description: "Fix authentication bug" },
  outputs: {},
});

console.log(snapshot.frameNo);       // 0
console.log(snapshot.tasks.length);  // Number of tasks in the tree
console.log(snapshot.xml);           // XML tree representation
```

## Signature

```ts
function renderFrame<Schema>(
  workflow: SmithersWorkflow<Schema>,
  ctx: {
    runId: string;
    iteration: number;
    iterations?: Record<string, number>;
    input: object;
    outputs: object;
  },
  opts?: { baseRootDir?: string },
): Promise<GraphSnapshot>;
```

### Context Object

| Field | Type | Description |
|---|---|---|
| `runId` | `string` | Run ID for the snapshot. Any string for previews. |
| `iteration` | `number` | Current loop iteration (`0` for non-looping workflows). |
| `iterations` | `Record<string, number>` | Per-loop iteration counts, keyed by loop ID. Optional. |
| `input` | `object` | Input data exposed to the workflow as `ctx.input`. |
| `outputs` | `object` | Previously computed outputs by table name. `{}` for initial state. |

### Options

| Field | Type | Default | Description |
|---|---|---|---|
| `baseRootDir` | `string` | `undefined` | Base directory for resolving relative paths. |

## GraphSnapshot

```ts
type GraphSnapshot = {
  runId: string;
  frameNo: number;
  xml: XmlNode | null;
  tasks: TaskDescriptor[];
};
```

| Field | Type | Description |
|---|---|---|
| `runId` | `string` | Run ID from context. |
| `frameNo` | `number` | Always `0` for `renderFrame`. |
| `xml` | `XmlNode \| null` | Rendered XML tree, or `null` if empty. |
| `tasks` | `TaskDescriptor[]` | Tasks in execution order. |

### XmlNode

```ts
type XmlNode = XmlElement | XmlText;

type XmlElement = {
  kind: "element";
  tag: string;           // e.g. "smithers:workflow", "smithers:task"
  props: Record<string, string>;
  children: XmlNode[];
};

type XmlText = {
  kind: "text";
  text: string;
};
```
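
To inspect `snapshot.xml` as text, the tree can be walked recursively. The sketch below redeclares the types shown above so it stands alone; `xmlToString` and `escapeXml` are hypothetical helpers, and sorting attributes alphabetically is a choice made here, not something Smithers is known to do.

```ts
type XmlNode = XmlElement | XmlText;

type XmlElement = {
  kind: "element";
  tag: string;
  props: Record<string, string>;
  children: XmlNode[];
};

type XmlText = {
  kind: "text";
  text: string;
};

function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function xmlToString(node: XmlNode): string {
  if (node.kind === "text") return escapeXml(node.text);
  const element: XmlElement = node;
  // Attributes are sorted so the output is stable for a given tree.
  const attrs = Object.keys(element.props)
    .sort()
    .map((key) => ` ${key}="${escapeXml(element.props[key])}"`)
    .join("");
  if (element.children.length === 0) return `<${element.tag}${attrs} />`;
  const inner = element.children.map((child) => xmlToString(child)).join("");
  return `<${element.tag}${attrs}>${inner}</${element.tag}>`;
}
```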

### TaskDescriptor

```ts
type TaskDescriptor = {
  nodeId: string;
  ordinal: number;
  iteration: number;
  ralphId?: string;
  dependsOn?: string[];
  needs?: Record<string, string>;
  worktreeId?: string;
  worktreePath?: string;
  worktreeBranch?: string;

  outputTable: Table | null;
  outputTableName: string;
  outputRef?: ZodObject<any>;
  outputSchema?: ZodObject<any>;

  parallelGroupId?: string;
  parallelMaxConcurrency?: number;

  needsApproval: boolean;
  approvalMode?: "gate" | "decision";
  approvalOnDeny?: "fail" | "continue" | "skip";
  skipIf: boolean;
  retries: number;
  retryPolicy?: RetryPolicy;
  timeoutMs: number | null;
  continueOnFail: boolean;
  cachePolicy?: CachePolicy;

  agent?: AgentLike | AgentLike[];
  prompt?: string;
  staticPayload?: unknown;
  computeFn?: () => unknown | Promise<unknown>;

  label?: string;
  meta?: Record<string, unknown>;
};
```

| Field | Description |
|---|---|
| `nodeId` | `id` prop from `<Task>`. |
| `ordinal` | Position in task list (0-indexed). |
| `iteration` | Loop iteration this task belongs to. |
| `ralphId` | Enclosing `<Loop>` ID, if any. |
| `dependsOn` | Node IDs this task depends on. |
| `needs` | Named dependencies. Keys are context keys, values are node IDs. |
| `worktreeId` | Git worktree ID. |
| `worktreePath` | Filesystem path to git worktree. |
| `worktreeBranch` | Branch name for git worktree. |
| `outputTable` | Drizzle table object for output. |
| `outputTableName` | String name of output table. |
| `outputRef` | Zod schema reference from `output` prop. |
| `outputSchema` | Zod schema for validating agent output. |
| `parallelGroupId` | Enclosing `<Parallel>` group ID. |
| `parallelMaxConcurrency` | Per-group concurrency limit. |
| `needsApproval` | Whether task requires human approval. |
| `approvalMode` | `"gate"` pauses before execution; `"decision"` records a decision. |
| `approvalOnDeny` | Behavior on denial: `"fail"`, `"continue"`, or `"skip"`. |
| `skipIf` | Whether task is skipped. |
| `retries` | Retry attempts on failure. |
| `retryPolicy` | Backoff config (`{ backoff?, initialDelayMs? }`). |
| `timeoutMs` | Per-task timeout in ms, or `null` for global default. |
| `continueOnFail` | Whether workflow continues on task failure. |
| `cachePolicy` | Cache config (`{ by?, version? }`). |
| `agent` | AI agent(s) assigned to this task. |
| `prompt` | Resolved prompt string. |
| `staticPayload` | Static output data (no-agent tasks). |
| `computeFn` | Callback for compute tasks. |
| `label` | Human-readable label. |
| `meta` | Arbitrary metadata. |
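
One way to use these fields is to ask which tasks could start next given a set of finished node IDs. This sketch works on a reduced `TaskLike` shape and only considers `dependsOn` and `skipIf`; it is an illustration, not the engine's scheduler, which also accounts for approvals, concurrency limits, and loops.

```ts
// Reduced, hypothetical view of a TaskDescriptor.
interface TaskLike {
  nodeId: string;
  ordinal: number;
  dependsOn?: string[];
  skipIf: boolean;
}

// Tasks that are not skipped, not finished, and whose dependencies have all finished.
function readyTasks(tasks: TaskLike[], finished: string[]): TaskLike[] {
  return tasks
    .filter((t) => !t.skipIf && finished.indexOf(t.nodeId) < 0)
    .filter((t) => (t.dependsOn || []).every((dep) => finished.indexOf(dep) >= 0))
    .sort((a, b) => a.ordinal - b.ordinal);
}
```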

## Use Cases

### Previewing the Execution Graph

```ts
const snapshot = await renderFrame(workflow, {
  runId: "dry-run",
  iteration: 0,
  input: { description: "Preview" },
  outputs: {},
});

for (const task of snapshot.tasks) {
  console.log(`${task.ordinal}. [${task.nodeId}] -> ${task.outputTableName}`);
  if (task.needsApproval) console.log("   (requires approval)");
  if (task.skipIf) console.log("   (skipped)");
}
```

### Simulating Completed Outputs

```ts
// First render: no outputs
const snap1 = await renderFrame(workflow, {
  runId: "sim",
  iteration: 0,
  input: { description: "Bug" },
  outputs: {},
});

// Second render: simulate "analyze" completing
const snap2 = await renderFrame(workflow, {
  runId: "sim",
  iteration: 0,
  input: { description: "Bug" },
  outputs: {
    analyze: [{ runId: "sim", nodeId: "analyze", iteration: 0, summary: "Found null pointer" }],
  },
});
```

### CLI

```bash
smithers graph workflow.tsx --input '{"description": "Fix bug"}'
```

Prints the `GraphSnapshot` as JSON to stdout.

## Related

- [runWorkflow](/runtime/run-workflow) -- Execute the workflow.
- [Events](/runtime/events) -- Monitor execution progress.
- [Execution Model](/concepts/execution-model) -- The render-execute-persist loop.

---

## Events

> Subscribe to fine-grained lifecycle events emitted during workflow execution.
> Source: https://smithers.sh/runtime/events

Smithers emits typed `SmithersEvent` objects throughout a run. Subscribe via `onProgress` in `runWorkflow`, or read persisted events from NDJSON log files.

Events serve as the durable replay/audit log, correlate with structured logs through `runId`/`nodeId`/`attempt`, and drive built-in lifecycle counters. For OTLP export and Prometheus/Grafana setup, see [Observability](/guides/monitoring-logs).

## Subscribing

### onProgress Callback

```ts
import { runWorkflow } from "smithers-orchestrator";
import workflow from "./workflow";

await runWorkflow(workflow, {
  input: { description: "Fix bug" },
  onProgress: (event) => {
    console.log(`[${event.type}] at ${event.timestampMs}`);

    if (event.type === "NodeStarted") {
      console.log(`  node: ${event.nodeId}, attempt: ${event.attempt}`);
    }

    if (event.type === "NodeFailed") {
      console.error(`  node: ${event.nodeId}, error:`, event.error);
    }
  },
});
```

### NDJSON Log Files

Events are appended as JSON lines to:

```
.smithers/executions/<runId>/logs/stream.ndjson
```

```bash
# Watch events in real time
tail -f .smithers/executions/abc123/logs/stream.ndjson | jq .

# Filter for failures
jq 'select(.type == "NodeFailed")' .smithers/executions/abc123/logs/stream.ndjson

# Count events by type
jq -r .type .smithers/executions/abc123/logs/stream.ndjson | sort | uniq -c | sort -rn
```

Configure with `logDir` in `runWorkflow` or `--log-dir` / `--no-log` in the CLI.
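
The same counting can be done in TypeScript once the file contents are in memory. A sketch: `countEventsByType` is a hypothetical helper, and skipping unparseable lines is a choice made here to tolerate a partially written final line.

```ts
// Counts events per `type` in an NDJSON string (one JSON object per line).
function countEventsByType(ndjson: string): Record<string, number> {
  const counts: Record<string, number> = Object.create(null);
  for (const line of ndjson.split("\n")) {
    if (line.trim() === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // tolerate a truncated or partially written line
    }
    const eventType =
      typeof parsed === "object" && parsed !== null
        ? (parsed as { type?: unknown }).type
        : undefined;
    if (typeof eventType !== "string") continue;
    counts[eventType] = (counts[eventType] || 0) + 1;
  }
  return counts;
}
```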

## Event-Driven Metrics

| Event | Metric |
|---|---|
| `RunStarted` | `smithers.runs.total` |
| `NodeStarted` | `smithers.nodes.started` |
| `NodeFinished` | `smithers.nodes.finished` |
| `NodeFailed` | `smithers.nodes.failed` |
| Approval events | Approval counters |

`trackSmithersEvent` from `smithers-orchestrator/observability` exposes this mapping for custom integrations.
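
For readers who only need the mapping, the table can be mirrored as a lookup feeding in-memory counters. This is not `trackSmithersEvent`, whose signature is not shown here; `metricForEvent` and `recordEvent` are hypothetical, and approval counters are left out because their names are not listed.

```ts
// Mirrors the event-to-metric table above.
function metricForEvent(eventType: string): string | undefined {
  switch (eventType) {
    case "RunStarted": return "smithers.runs.total";
    case "NodeStarted": return "smithers.nodes.started";
    case "NodeFinished": return "smithers.nodes.finished";
    case "NodeFailed": return "smithers.nodes.failed";
    default: return undefined;
  }
}

// Increments the matching counter; events without a mapped metric are ignored.
function recordEvent(counters: Record<string, number>, eventType: string): void {
  const metric = metricForEvent(eventType);
  if (metric === undefined) return;
  counters[metric] = (counters[metric] || 0) + 1;
}
```

Passing `recordEvent` a shared counters object from an `onProgress` callback gives a rough local tally without any exporter.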

## Common Fields

Every `SmithersEvent`:

| Field | Type | Description |
|---|---|---|
| `type` | `string` | Event type discriminator. |
| `runId` | `string` | Run this event belongs to. |
| `timestampMs` | `number` | Unix timestamp in milliseconds. |

Node-scoped events add:

| Field | Type | Description |
|---|---|---|
| `nodeId` | `string` | Task node ID. |
| `iteration` | `number` | Loop iteration number. |

Attempt-scoped events add:

| Field | Type | Description |
|---|---|---|
| `attempt` | `number` | Attempt number (starts at 1). |
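
These layers compose naturally as extending interfaces, which also gives a simple log correlation key. The type names below are local to this sketch, and it assumes attempt-scoped events also carry the node-scoped fields.

```ts
interface BaseEventFields {
  type: string;
  runId: string;
  timestampMs: number;
}

interface NodeScopedFields extends BaseEventFields {
  nodeId: string;
  iteration: number;
}

// Assumption: attempt-scoped events also include the node-scoped fields.
interface AttemptScopedFields extends NodeScopedFields {
  attempt: number;
}

function isNodeScoped(e: BaseEventFields): e is NodeScopedFields {
  const c = e as Partial<NodeScopedFields>;
  return typeof c.nodeId === "string" && typeof c.iteration === "number";
}

// Builds "runId", "runId/nodeId", or "runId/nodeId/attempt" from whatever is present.
function correlationKey(e: BaseEventFields): string {
  const parts = [e.runId];
  const c = e as Partial<AttemptScopedFields>;
  if (typeof c.nodeId === "string") parts.push(c.nodeId);
  if (typeof c.attempt === "number") parts.push(String(c.attempt));
  return parts.join("/");
}
```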

## Event Types

### Supervisor

#### SupervisorStarted

Emitted when the supervisor process starts polling for stale runs.

```ts
{
  type: "SupervisorStarted",
  runId: string,
  pollIntervalMs: number,
  staleThresholdMs: number,
  timestampMs: number,
}
```

`pollIntervalMs`: How often the supervisor checks for stale runs. `staleThresholdMs`: Age after which a run is considered stale.

#### SupervisorPollCompleted

Emitted after each supervisor poll cycle.

```ts
{
  type: "SupervisorPollCompleted",
  runId: string,
  staleCount: number,
  resumedCount: number,
  skippedCount: number,
  durationMs: number,
  timestampMs: number,
}
```

`staleCount`: Runs found to be stale. `resumedCount`: Runs successfully auto-resumed. `skippedCount`: Stale runs skipped (e.g. process still alive). `durationMs`: Wall time for this poll cycle.

### Run Lifecycle

#### RunStarted

Emitted once at the beginning of every run (including resumes).

```ts
{ type: "RunStarted", runId: string, timestampMs: number }
```

#### RunStatusChanged

```ts
{ type: "RunStatusChanged", runId: string, status: RunStatus, timestampMs: number }
```

`RunStatus`: `"running"` | `"waiting-approval"` | `"waiting-event"` | `"waiting-timer"` | `"finished"` | `"continued"` | `"failed"` | `"cancelled"`.

#### RunFinished

```ts
{ type: "RunFinished", runId: string, timestampMs: number }
```

#### RunFailed

```ts
{ type: "RunFailed", runId: string, error: unknown, timestampMs: number }
```

#### RunCancelled

```ts
{ type: "RunCancelled", runId: string, timestampMs: number }
```

#### RunAutoResumed

Emitted by the supervisor when a stale run is automatically restarted.

```ts
{
  type: "RunAutoResumed",
  runId: string,
  lastHeartbeatAtMs: number | null,
  staleDurationMs: number,
  timestampMs: number,
}
```

`lastHeartbeatAtMs`: Unix ms of the last recorded heartbeat, or `null` if no heartbeat was recorded. `staleDurationMs`: How long the run had been stale before resumption.

#### RunAutoResumeSkipped

Emitted when the supervisor decided not to resume a stale run.

```ts
{
  type: "RunAutoResumeSkipped",
  runId: string,
  reason: "pid-alive" | "missing-workflow" | "rate-limited",
  timestampMs: number,
}
```

`reason`: `"pid-alive"` — the original process is still running; `"missing-workflow"` — workflow file could not be located; `"rate-limited"` — resumption was throttled.
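
The three skip reasons suggest a guard sequence like the one below. Only the reason strings come from the event definition; `StaleRunFacts`, `decideAutoResume`, and the order in which the checks run are assumptions made for illustration.

```ts
type AutoResumeSkipReason = "pid-alive" | "missing-workflow" | "rate-limited";

// Hypothetical facts a supervisor might gather about a stale run.
interface StaleRunFacts {
  ownerPidAlive: boolean;
  workflowFound: boolean;
  rateLimited: boolean;
}

// Returns "resume" or the reason the run is skipped. Check order is an assumption.
function decideAutoResume(facts: StaleRunFacts): "resume" | AutoResumeSkipReason {
  if (facts.ownerPidAlive) return "pid-alive";
  if (!facts.workflowFound) return "missing-workflow";
  if (facts.rateLimited) return "rate-limited";
  return "resume";
}
```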

#### RunContinuedAsNew

Emitted when a long-running workflow continues as a fresh run, carrying forward state.

```ts
{
  type: "RunContinuedAsNew",
  runId: string,
  newRunId: string,
  iteration: number,
  carriedStateSize: number,
  ancestryDepth?: number,
  timestampMs: number,
}
```

`newRunId`: The run ID of the continuation. `carriedStateSize`: Byte size of the state passed to the new run. `ancestryDepth`: How many continuation hops have occurred (omitted on first continuation).
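
Each `RunContinuedAsNew` event links one run to its successor, so a list of these events can be followed to reconstruct the full continuation chain. A sketch under the assumption that each run continues at most once; `continuationChain` is a hypothetical helper.

```ts
// Event shape copied from the definition above.
type RunContinuedAsNew = {
  type: "RunContinuedAsNew";
  runId: string;
  newRunId: string;
  iteration: number;
  carriedStateSize: number;
  ancestryDepth?: number;
  timestampMs: number;
};

// Follow runId -> newRunId links starting at `startRunId`.
function continuationChain(events: RunContinuedAsNew[], startRunId: string): string[] {
  const next = new Map<string, string>();
  for (const e of events) next.set(e.runId, e.newRunId);

  const chain = [startRunId];
  const seen = new Set<string>(chain);
  let nextId = next.get(startRunId);
  // The `seen` check guards against cycles in malformed data.
  while (nextId !== undefined && !seen.has(nextId)) {
    chain.push(nextId);
    seen.add(nextId);
    nextId = next.get(nextId);
  }
  return chain;
}
```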

#### RunForked

Emitted when a run is forked from a parent run's snapshot for time-travel or branching.

```ts
{
  type: "RunForked",
  runId: string,
  parentRunId: string,
  parentFrameNo: number,
  branchLabel?: string,
  timestampMs: number,
}
```

`parentRunId`: The run this fork originated from. `parentFrameNo`: Frame number in the parent run where the fork was taken. `branchLabel`: Optional human-readable label for the branch.

#### ReplayStarted

Emitted when a run begins replaying from a parent run's snapshot.

```ts
{
  type: "ReplayStarted",
  runId: string,
  parentRunId: string,
  parentFrameNo: number,
  restoreVcs: boolean,
  timestampMs: number,
}
```

`parentRunId`: The run being replayed from. `parentFrameNo`: Snapshot frame to replay from. `restoreVcs`: Whether VCS state was restored as part of the replay.

### Frame Events

#### FrameCommitted

Emitted each time the engine renders a new frame.

```ts
{
  type: "FrameCommitted",
  runId: string,
  frameNo: number,
  xmlHash: string,
  timestampMs: number,
}
```

`xmlHash`: SHA-256 hex digest of the canonicalized XML tree.

### Snapshot

#### SnapshotCaptured

Emitted when the engine captures a point-in-time snapshot of the workflow frame, enabling time-travel and forking.

```ts
{
  type: "SnapshotCaptured",
  runId: string,
  frameNo: number,
  contentHash: string,
  timestampMs: number,
}
```

`frameNo`: The frame this snapshot was taken at. `contentHash`: Hash of the snapshot content, used to detect duplicate snapshots.

### Node Lifecycle

#### NodePending

Emitted when a task has been identified and is waiting to be scheduled.

```ts
{ type: "NodePending", runId: string, nodeId: string, iteration: number, timestampMs: number }
```

#### NodeStarted

```ts
{
  type: "NodeStarted",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt: number,
  timestampMs: number,
}
```

#### NodeFinished

```ts
{
  type: "NodeFinished",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt: number,
  timestampMs: number,
}
```

#### NodeFailed

```ts
{
  type: "NodeFailed",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt: number,
  error: unknown,
  timestampMs: number,
}
```

#### NodeCancelled

```ts
{
  type: "NodeCancelled",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt?: number,
  reason?: string,
  timestampMs: number,
}
```

`reason` may be `"unmounted"` if the task disappeared from the tree after re-render.

#### NodeSkipped

```ts
{ type: "NodeSkipped", runId: string, nodeId: string, iteration: number, timestampMs: number }
```

#### NodeRetrying

Emitted before the next attempt starts.

```ts
{
  type: "NodeRetrying",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt: number,
  timestampMs: number,
}
```

`attempt` is the upcoming attempt number.

#### NodeWaitingApproval

```ts
{
  type: "NodeWaitingApproval",
  runId: string,
  nodeId: string,
  iteration: number,
  timestampMs: number,
}
```

#### NodeWaitingTimer

Emitted when a node is suspended waiting for a timer to fire.

```ts
{
  type: "NodeWaitingTimer",
  runId: string,
  nodeId: string,
  iteration: number,
  firesAtMs: number,
  timestampMs: number,
}
```

`firesAtMs`: Unix ms when the timer is scheduled to fire.
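
The node lifecycle events can be folded into a "latest known state" per task. The sketch below assumes events arrive in emission order and keys state by `nodeId` and `iteration`; `NodeEvent`, `NodeState`, and `foldNodeEvents` are hypothetical names, and the event type is narrowed to the fields the reducer needs.

```ts
type NodeEventType =
  | "NodePending"
  | "NodeStarted"
  | "NodeFinished"
  | "NodeFailed"
  | "NodeCancelled"
  | "NodeSkipped"
  | "NodeRetrying"
  | "NodeWaitingApproval"
  | "NodeWaitingTimer";

// Common subset of the node event shapes above.
type NodeEvent = {
  type: NodeEventType;
  runId: string;
  nodeId: string;
  iteration: number;
  attempt?: number; // absent on events such as NodePending
  timestampMs: number;
};

type NodeState = {
  lastEvent: NodeEventType;
  attempt: number | null;
  updatedAtMs: number;
};

// Fold events into the most recent state per (nodeId, iteration).
function foldNodeEvents(events: NodeEvent[]): Map<string, NodeState> {
  const states = new Map<string, NodeState>();
  for (const e of events) {
    const key = `${e.nodeId}#${e.iteration}`;
    const prev = states.get(key);
    states.set(key, {
      lastEvent: e.type,
      // Keep the last known attempt when the event does not carry one.
      attempt: e.attempt ?? prev?.attempt ?? null,
      updatedAtMs: e.timestampMs,
    });
  }
  return states;
}
```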

### Approval Events

#### ApprovalRequested

```ts
{
  type: "ApprovalRequested",
  runId: string,
  nodeId: string,
  iteration: number,
  timestampMs: number,
}
```

#### ApprovalGranted

```ts
{
  type: "ApprovalGranted",
  runId: string,
  nodeId: string,
  iteration: number,
  timestampMs: number,
}
```

#### ApprovalAutoApproved

Emitted when an approval is granted automatically by a configured policy without human intervention.

```ts
{
  type: "ApprovalAutoApproved",
  runId: string,
  nodeId: string,
  iteration: number,
  timestampMs: number,
}
```

#### ApprovalDenied

```ts
{
  type: "ApprovalDenied",
  runId: string,
  nodeId: string,
  iteration: number,
  timestampMs: number,
}
```

### Tool Events

#### ToolCallStarted

```ts
{
  type: "ToolCallStarted",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt: number,
  toolName: string,
  seq: number,
  timestampMs: number,
}
```

`seq`: sequential counter for tool calls within the attempt.

#### ToolCallFinished

```ts
{
  type: "ToolCallFinished",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt: number,
  toolName: string,
  seq: number,
  status: "success" | "error",
  timestampMs: number,
}
```
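
Pairing the two tool events gives per-call latency. The sketch below assumes a `ToolCallFinished` event carries the same `seq` as its `ToolCallStarted` counterpart within one attempt, which the shapes above suggest but do not state outright; `toolCallTimings` is a hypothetical helper.

```ts
// Event shapes copied from the definitions above.
type ToolCallStarted = {
  type: "ToolCallStarted";
  runId: string;
  nodeId: string;
  iteration: number;
  attempt: number;
  toolName: string;
  seq: number;
  timestampMs: number;
};

type ToolCallFinished = Omit<ToolCallStarted, "type"> & {
  type: "ToolCallFinished";
  status: "success" | "error";
};

type ToolCallTiming = {
  toolName: string;
  seq: number;
  status: "success" | "error";
  durationMs: number;
};

function toolCallTimings(events: Array<ToolCallStarted | ToolCallFinished>): ToolCallTiming[] {
  const open = new Map<string, ToolCallStarted>();
  const timings: ToolCallTiming[] = [];
  for (const e of events) {
    // seq restarts per attempt, so the key includes the attempt coordinates.
    const key = [e.runId, e.nodeId, e.iteration, e.attempt, e.seq].join("/");
    if (e.type === "ToolCallStarted") {
      open.set(key, e);
      continue;
    }
    const started = open.get(key);
    if (!started) continue; // finish without a recorded start: skip it
    open.delete(key);
    timings.push({
      toolName: e.toolName,
      seq: e.seq,
      status: e.status,
      durationMs: e.timestampMs - started.timestampMs,
    });
  }
  return timings;
}
```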

### Output Events

#### NodeOutput

Emitted as an agent streams text output.

```ts
{
  type: "NodeOutput",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt: number,
  text: string,
  stream: "stdout" | "stderr",
  timestampMs: number,
}
```
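
Since `NodeOutput` carries text in pieces, a consumer that wants the full transcript has to concatenate them. A minimal sketch, assuming events are processed in emission order; `collectOutput` is a hypothetical helper.

```ts
// Event shape copied from the definition above.
type NodeOutput = {
  type: "NodeOutput";
  runId: string;
  nodeId: string;
  iteration: number;
  attempt: number;
  text: string;
  stream: "stdout" | "stderr";
  timestampMs: number;
};

type Transcript = { stdout: string; stderr: string };

// Concatenate streamed text per (nodeId, iteration, attempt), one buffer per stream.
function collectOutput(events: NodeOutput[]): Map<string, Transcript> {
  const byAttempt = new Map<string, Transcript>();
  for (const e of events) {
    const key = `${e.nodeId}#${e.iteration}#${e.attempt}`;
    const buf = byAttempt.get(key) ?? { stdout: "", stderr: "" };
    buf[e.stream] += e.text;
    byAttempt.set(key, buf);
  }
  return byAttempt;
}
```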

### Timer Events

#### TimerCreated

Emitted when a durable timer is registered with the engine.

```ts
{
  type: "TimerCreated",
  runId: string,
  timerId: string,
  firesAtMs: number,
  timerType: "duration" | "absolute",
  timestampMs: number,
}
```

`timerId`: Stable identifier for this timer. `firesAtMs`: Unix ms when the timer will fire. `timerType`: `"duration"` — created from a relative delay; `"absolute"` — created from a specific wall-clock time.

#### TimerFired

Emitted when a timer fires and resumes its waiting node.

```ts
{
  type: "TimerFired",
  runId: string,
  timerId: string,
  firesAtMs: number,
  firedAtMs: number,
  delayMs: number,
  timestampMs: number,
}
```

`firesAtMs`: Scheduled fire time. `firedAtMs`: Actual fire time. `delayMs`: Difference between actual and scheduled fire time; non-zero indicates scheduler lag.
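
A monitor can use these fields to flag timers that fired late. The sketch reads "difference between actual and scheduled fire time" as `firedAtMs - firesAtMs`, which is an assumption; `schedulerLagMs`, `isLate`, and the default tolerance are hypothetical.

```ts
// Event shape copied from the definition above.
type TimerFired = {
  type: "TimerFired";
  runId: string;
  timerId: string;
  firesAtMs: number;
  firedAtMs: number;
  delayMs: number;
  timestampMs: number;
};

// Recompute lag from the two timestamps instead of trusting delayMs,
// so the result does not depend on how delayMs is signed.
function schedulerLagMs(e: TimerFired): number {
  return e.firedAtMs - e.firesAtMs;
}

// Flag timers that fired more than `toleranceMs` after their scheduled time.
function isLate(e: TimerFired, toleranceMs = 1000): boolean {
  return schedulerLagMs(e) > toleranceMs;
}
```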

#### TimerCancelled

Emitted when a timer is cancelled before it fires.

```ts
{
  type: "TimerCancelled",
  runId: string,
  timerId: string,
  timestampMs: number,
}
```

### Task Heartbeat Events

#### TaskHeartbeat

Emitted periodically by long-running tasks to signal they are still alive.

```ts
{
  type: "TaskHeartbeat",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt: number,
  hasData: boolean,
  dataSizeBytes: number,
  intervalMs?: number,
  timestampMs: number,
}
```

`hasData`: Whether the heartbeat carries a checkpoint payload. `dataSizeBytes`: Byte size of any checkpoint data. `intervalMs`: Configured heartbeat interval, if set.

#### TaskHeartbeatTimeout

Emitted when a task fails to send a heartbeat within its configured timeout window.

```ts
{
  type: "TaskHeartbeatTimeout",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt: number,
  lastHeartbeatAtMs: number,
  timeoutMs: number,
  timestampMs: number,
}
```

`lastHeartbeatAtMs`: Unix ms of the last heartbeat received before timeout. `timeoutMs`: The configured timeout duration.

### Sandbox Events

#### SandboxCreated

Emitted when a sandboxed execution environment is provisioned.

```ts
{
  type: "SandboxCreated",
  runId: string,
  sandboxId: string,
  runtime: "bubblewrap" | "docker" | "codeplane",
  configJson: string,
  timestampMs: number,
}
```

`sandboxId`: Unique identifier for this sandbox instance. `runtime`: The isolation backend used. `configJson`: JSON-serialized sandbox configuration.

#### SandboxShipped

Emitted when the initial code bundle has been uploaded to the sandbox.

```ts
{
  type: "SandboxShipped",
  runId: string,
  sandboxId: string,
  runtime: "bubblewrap" | "docker" | "codeplane",
  bundleSizeBytes: number,
  timestampMs: number,
}
```

`bundleSizeBytes`: Size of the uploaded bundle in bytes.

#### SandboxHeartbeat

Emitted periodically while a sandbox is executing to indicate liveness.

```ts
{
  type: "SandboxHeartbeat",
  runId: string,
  sandboxId: string,
  remoteRunId?: string,
  progress?: number,
  timestampMs: number,
}
```

`remoteRunId`: Run ID assigned by the remote sandbox environment, if available. `progress`: Optional 0–1 progress fraction reported by the sandbox.

#### SandboxBundleReceived

Emitted when the sandbox returns an output bundle to the orchestrator.

```ts
{
  type: "SandboxBundleReceived",
  runId: string,
  sandboxId: string,
  bundleSizeBytes: number,
  patchCount: number,
  hasOutputs: boolean,
  timestampMs: number,
}
```

`bundleSizeBytes`: Size of the received bundle. `patchCount`: Number of file patches included in the bundle. `hasOutputs`: Whether structured task outputs were included.

#### SandboxCompleted

Emitted when a sandbox execution finishes (regardless of outcome).

```ts
{
  type: "SandboxCompleted",
  runId: string,
  sandboxId: string,
  remoteRunId?: string,
  runtime: "bubblewrap" | "docker" | "codeplane",
  status: "finished" | "failed" | "cancelled",
  durationMs: number,
  timestampMs: number,
}
```

`status`: Final execution status. `durationMs`: Total sandbox execution time.

#### SandboxFailed

Emitted when a sandbox encounters an unrecoverable error.

```ts
{
  type: "SandboxFailed",
  runId: string,
  sandboxId: string,
  runtime: "bubblewrap" | "docker" | "codeplane",
  error: unknown,
  timestampMs: number,
}
```

#### SandboxDiffReviewRequested

Emitted when a sandbox produces patches that require human review before being applied.

```ts
{
  type: "SandboxDiffReviewRequested",
  runId: string,
  sandboxId: string,
  patchCount: number,
  totalDiffLines: number,
  timestampMs: number,
}
```

`patchCount`: Number of patches awaiting review. `totalDiffLines`: Total lines across all diffs.

#### SandboxDiffAccepted

Emitted when a human reviewer accepts the sandbox's proposed patches.

```ts
{
  type: "SandboxDiffAccepted",
  runId: string,
  sandboxId: string,
  patchCount: number,
  timestampMs: number,
}
```

#### SandboxDiffRejected

Emitted when a human reviewer rejects the sandbox's proposed patches.

```ts
{
  type: "SandboxDiffRejected",
  runId: string,
  sandboxId: string,
  reason?: string,
  timestampMs: number,
}
```

`reason`: Optional explanation for the rejection.

### Revert Events

#### RevertStarted

```ts
{
  type: "RevertStarted",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt: number,
  jjPointer: string,
  timestampMs: number,
}
```

#### RevertFinished

```ts
{
  type: "RevertFinished",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt: number,
  jjPointer: string,
  success: boolean,
  error?: string,
  timestampMs: number,
}
```

### Retry / Time-Travel Events

#### RetryTaskStarted

Emitted when a manual or programmatic retry is initiated for a specific task node.

```ts
{
  type: "RetryTaskStarted",
  runId: string,
  nodeId: string,
  iteration: number,
  resetDependents: boolean,
  resetNodes: string[],
  timestampMs: number,
}
```

`resetDependents`: Whether nodes that depend on this task are also being reset. `resetNodes`: Full list of node IDs being cleared as part of this retry.

#### RetryTaskFinished

Emitted when the retry operation completes.

```ts
{
  type: "RetryTaskFinished",
  runId: string,
  nodeId: string,
  iteration: number,
  resetNodes: string[],
  success: boolean,
  error?: string,
  timestampMs: number,
}
```

`resetNodes`: Node IDs that were actually reset. `error`: Set if the retry operation itself failed (not the retried task).

#### TimeTravelStarted

Emitted when a time-travel operation begins, rewinding the run to a prior state.

```ts
{
  type: "TimeTravelStarted",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt: number,
  jjPointer?: string,
  timestampMs: number,
}
```

`jjPointer`: VCS change identifier to restore to, if VCS state is being rewound.

#### TimeTravelFinished

Emitted when the time-travel operation completes.

```ts
{
  type: "TimeTravelFinished",
  runId: string,
  nodeId: string,
  iteration: number,
  attempt: number,
  jjPointer?: string,
  success: boolean,
  vcsRestored: boolean,
  resetNodes: string[],
  error?: string,
  timestampMs: number,
}
```

`vcsRestored`: Whether VCS state was successfully rewound. `resetNodes`: Node IDs that were cleared as part of the rewind. `error`: Set if time-travel failed.

### Voice Events

#### VoiceStarted

Emitted when a voice operation begins.

```ts
{
  type: "VoiceStarted",
  runId: string,
  nodeId: string,
  iteration: number,
  operation: "speak" | "listen",
  provider: string,
  timestampMs: number,
}
```

`operation`: `"speak"` for text-to-speech; `"listen"` for speech-to-text. `provider`: The voice provider in use (e.g. `"openai"`, `"elevenlabs"`).

#### VoiceFinished

Emitted when a voice operation completes successfully.

```ts
{
  type: "VoiceFinished",
  runId: string,
  nodeId: string,
  iteration: number,
  operation: "speak" | "listen",
  provider: string,
  durationMs: number,
  timestampMs: number,
}
```

`durationMs`: Wall time for the voice operation.

#### VoiceError

Emitted when a voice operation fails.

```ts
{
  type: "VoiceError",
  runId: string,
  nodeId: string,
  iteration: number,
  operation: "speak" | "listen",
  provider: string,
  error: unknown,
  timestampMs: number,
}
```

### RAG Events

#### RagIngested

Emitted after documents are chunked and embedded into a vector store namespace.

```ts
{
  type: "RagIngested",
  runId: string,
  documentCount: number,
  chunkCount: number,
  namespace: string,
  timestampMs: number,
}
```

`documentCount`: Number of source documents ingested. `chunkCount`: Number of chunks stored after splitting. `namespace`: The vector store namespace written to.

#### RagRetrieved

Emitted after a semantic search query completes.

```ts
{
  type: "RagRetrieved",
  runId: string,
  query: string,
  resultCount: number,
  namespace: string,
  topScore: number,
  timestampMs: number,
}
```

`query`: The query string submitted. `resultCount`: Number of chunks returned. `topScore`: Similarity score of the highest-ranked result.

### Memory Events

#### MemoryFactSet

Emitted when a key-value fact is written to the memory store.

```ts
{
  type: "MemoryFactSet",
  runId: string,
  namespace: string,
  key: string,
  timestampMs: number,
}
```

`namespace`: Memory namespace the fact belongs to. `key`: Key under which the fact was stored.

#### MemoryRecalled

Emitted when the memory store is queried for relevant facts.

```ts
{
  type: "MemoryRecalled",
  runId: string,
  namespace: string,
  query: string,
  resultCount: number,
  timestampMs: number,
}
```

`query`: The recall query. `resultCount`: Number of facts returned.

#### MemoryMessageSaved

Emitted when a conversation message is persisted to memory.

```ts
{
  type: "MemoryMessageSaved",
  runId: string,
  threadId: string,
  role: string,
  timestampMs: number,
}
```

`threadId`: Identifier of the conversation thread. `role`: Message role (e.g. `"user"`, `"assistant"`).

### OpenAPI Events

#### OpenApiToolCalled

Emitted when a generated OpenAPI tool executes an HTTP operation.

```ts
{
  type: "OpenApiToolCalled",
  runId: string,
  operationId: string,
  method: string,
  path: string,
  durationMs: number,
  status: "success" | "error",
  timestampMs: number,
}
```

`operationId`: The OpenAPI `operationId` of the called operation. `method`: HTTP method (e.g. `"GET"`, `"POST"`). `path`: URL path template. `durationMs`: Round-trip duration. `status`: Whether the HTTP call succeeded or errored.

### Hot Reload

#### WorkflowReloadDetected

```ts
{
  type: "WorkflowReloadDetected",
  runId: string,
  changedFiles: string[],
  timestampMs: number
}
```

#### WorkflowReloaded

```ts
{
  type: "WorkflowReloaded",
  runId: string,
  generation: number,
  changedFiles: string[],
  timestampMs: number
}
```

`generation`: monotonically increasing reload counter.

#### WorkflowReloadFailed

```ts
{
  type: "WorkflowReloadFailed",
  runId: string,
  error: unknown,
  changedFiles: string[],
  timestampMs: number
}
```

The engine continues with the previous valid code.

#### WorkflowReloadUnsafe

```ts
{
  type: "WorkflowReloadUnsafe",
  runId: string,
  reason: string,
  changedFiles: string[],
  timestampMs: number
}
```

Schema changes require a process restart.

### Scorer Events

#### ScorerStarted

Emitted when a scorer begins evaluating a task's output.

```ts
{
  type: "ScorerStarted",
  runId: string,
  nodeId: string,
  scorerId: string,
  scorerName: string,
  timestampMs: number,
}
```

`scorerId`: Unique identifier of the scorer. `scorerName`: Human-readable scorer name.

#### ScorerFinished

Emitted when a scorer completes successfully.

```ts
{
  type: "ScorerFinished",
  runId: string,
  nodeId: string,
  scorerId: string,
  scorerName: string,
  score: number,
  timestampMs: number,
}
```

`score`: The 0–1 normalized score produced by the scorer.
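
Because scores are normalized to 0–1, they can be averaged across tasks for a given scorer. A minimal sketch; `meanScoreByScorer` is a hypothetical helper, and averaging is just one possible aggregation.

```ts
// Event shape copied from the definition above.
type ScorerFinished = {
  type: "ScorerFinished";
  runId: string;
  nodeId: string;
  scorerId: string;
  scorerName: string;
  score: number;
  timestampMs: number;
};

// Mean score per scorerId across all supplied events.
function meanScoreByScorer(events: ScorerFinished[]): Map<string, number> {
  const sums = new Map<string, { total: number; count: number }>();
  for (const e of events) {
    const acc = sums.get(e.scorerId) ?? { total: 0, count: 0 };
    acc.total += e.score;
    acc.count += 1;
    sums.set(e.scorerId, acc);
  }
  const means = new Map<string, number>();
  sums.forEach((acc, scorerId) => means.set(scorerId, acc.total / acc.count));
  return means;
}
```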

#### ScorerFailed

Emitted when a scorer throws an error during evaluation.

```ts
{
  type: "ScorerFailed",
  runId: string,
  nodeId: string,
  scorerId: string,
  scorerName: string,
  error: unknown,
  timestampMs: number,
}
```

`error`: The error thrown by the scorer. Scorer failures never fail the parent task — they are logged and the workflow continues.

See [Evals & Scorers](/concepts/evals) for the full scoring system documentation.

## Quick Reference

| Event Type | Section | Extra Fields |
|---|---|---|
| `SupervisorStarted` | Supervisor | `pollIntervalMs`, `staleThresholdMs` |
| `SupervisorPollCompleted` | Supervisor | `staleCount`, `resumedCount`, `skippedCount`, `durationMs` |
| `RunStarted` | Run Lifecycle | -- |
| `RunStatusChanged` | Run Lifecycle | `status` |
| `RunFinished` | Run Lifecycle | -- |
| `RunFailed` | Run Lifecycle | `error` |
| `RunCancelled` | Run Lifecycle | -- |
| `RunAutoResumed` | Run Lifecycle | `lastHeartbeatAtMs`, `staleDurationMs` |
| `RunAutoResumeSkipped` | Run Lifecycle | `reason` |
| `RunContinuedAsNew` | Run Lifecycle | `newRunId`, `iteration`, `carriedStateSize`, `ancestryDepth?` |
| `RunForked` | Run Lifecycle | `parentRunId`, `parentFrameNo`, `branchLabel?` |
| `ReplayStarted` | Run Lifecycle | `parentRunId`, `parentFrameNo`, `restoreVcs` |
| `FrameCommitted` | Frame Events | `frameNo`, `xmlHash` |
| `SnapshotCaptured` | Snapshot | `frameNo`, `contentHash` |
| `NodePending` | Node Lifecycle | `nodeId`, `iteration` |
| `NodeStarted` | Node Lifecycle | `nodeId`, `iteration`, `attempt` |
| `NodeFinished` | Node Lifecycle | `nodeId`, `iteration`, `attempt` |
| `NodeFailed` | Node Lifecycle | `nodeId`, `iteration`, `attempt`, `error` |
| `NodeCancelled` | Node Lifecycle | `nodeId`, `iteration`, `attempt?`, `reason?` |
| `NodeSkipped` | Node Lifecycle | `nodeId`, `iteration` |
| `NodeRetrying` | Node Lifecycle | `nodeId`, `iteration`, `attempt` |
| `NodeWaitingApproval` | Node Lifecycle | `nodeId`, `iteration` |
| `NodeWaitingTimer` | Node Lifecycle | `nodeId`, `iteration`, `firesAtMs` |
| `ApprovalRequested` | Approval | `nodeId`, `iteration` |
| `ApprovalGranted` | Approval | `nodeId`, `iteration` |
| `ApprovalAutoApproved` | Approval | `nodeId`, `iteration` |
| `ApprovalDenied` | Approval | `nodeId`, `iteration` |
| `ToolCallStarted` | Tool | `nodeId`, `iteration`, `attempt`, `toolName`, `seq` |
| `ToolCallFinished` | Tool | `nodeId`, `iteration`, `attempt`, `toolName`, `seq`, `status` |
| `NodeOutput` | Output | `nodeId`, `iteration`, `attempt`, `text`, `stream` |
| `TimerCreated` | Timer | `timerId`, `firesAtMs`, `timerType` |
| `TimerFired` | Timer | `timerId`, `firesAtMs`, `firedAtMs`, `delayMs` |
| `TimerCancelled` | Timer | `timerId` |
| `TaskHeartbeat` | Task Heartbeat | `nodeId`, `iteration`, `attempt`, `hasData`, `dataSizeBytes`, `intervalMs?` |
| `TaskHeartbeatTimeout` | Task Heartbeat | `nodeId`, `iteration`, `attempt`, `lastHeartbeatAtMs`, `timeoutMs` |
| `SandboxCreated` | Sandbox | `sandboxId`, `runtime`, `configJson` |
| `SandboxShipped` | Sandbox | `sandboxId`, `runtime`, `bundleSizeBytes` |
| `SandboxHeartbeat` | Sandbox | `sandboxId`, `remoteRunId?`, `progress?` |
| `SandboxBundleReceived` | Sandbox | `sandboxId`, `bundleSizeBytes`, `patchCount`, `hasOutputs` |
| `SandboxCompleted` | Sandbox | `sandboxId`, `remoteRunId?`, `runtime`, `status`, `durationMs` |
| `SandboxFailed` | Sandbox | `sandboxId`, `runtime`, `error` |
| `SandboxDiffReviewRequested` | Sandbox | `sandboxId`, `patchCount`, `totalDiffLines` |
| `SandboxDiffAccepted` | Sandbox | `sandboxId`, `patchCount` |
| `SandboxDiffRejected` | Sandbox | `sandboxId`, `reason?` |
| `RevertStarted` | Revert | `nodeId`, `iteration`, `attempt`, `jjPointer` |
| `RevertFinished` | Revert | `nodeId`, `iteration`, `attempt`, `jjPointer`, `success`, `error?` |
| `RetryTaskStarted` | Retry / Time-Travel | `nodeId`, `iteration`, `resetDependents`, `resetNodes` |
| `RetryTaskFinished` | Retry / Time-Travel | `nodeId`, `iteration`, `resetNodes`, `success`, `error?` |
| `TimeTravelStarted` | Retry / Time-Travel | `nodeId`, `iteration`, `attempt`, `jjPointer?` |
| `TimeTravelFinished` | Retry / Time-Travel | `nodeId`, `iteration`, `attempt`, `jjPointer?`, `success`, `vcsRestored`, `resetNodes`, `error?` |
| `VoiceStarted` | Voice | `nodeId`, `iteration`, `operation`, `provider` |
| `VoiceFinished` | Voice | `nodeId`, `iteration`, `operation`, `provider`, `durationMs` |
| `VoiceError` | Voice | `nodeId`, `iteration`, `operation`, `provider`, `error` |
| `RagIngested` | RAG | `documentCount`, `chunkCount`, `namespace` |
| `RagRetrieved` | RAG | `query`, `resultCount`, `namespace`, `topScore` |
| `MemoryFactSet` | Memory | `namespace`, `key` |
| `MemoryRecalled` | Memory | `namespace`, `query`, `resultCount` |
| `MemoryMessageSaved` | Memory | `threadId`, `role` |
| `OpenApiToolCalled` | OpenAPI | `operationId`, `method`, `path`, `durationMs`, `status` |
| `AgentEvent` | Output | `nodeId`, `iteration`, `attempt`, `engine`, `event` |
| `WorkflowReloadDetected` | Hot Reload | `changedFiles` |
| `WorkflowReloaded` | Hot Reload | `generation`, `changedFiles` |
| `WorkflowReloadFailed` | Hot Reload | `error`, `changedFiles` |
| `WorkflowReloadUnsafe` | Hot Reload | `reason`, `changedFiles` |
| `RunHijackRequested` | Run Lifecycle | `target?` |
| `RunHijacked` | Run Lifecycle | `nodeId`, `iteration`, `attempt`, `engine`, `mode`, `resume?`, `cwd` |
| `ScorerStarted` | Scorer | `nodeId`, `scorerId`, `scorerName` |
| `ScorerFinished` | Scorer | `nodeId`, `scorerId`, `scorerName`, `score` |
| `ScorerFailed` | Scorer | `nodeId`, `scorerId`, `scorerName`, `error` |
| `TokenUsageReported` | Output | `nodeId`, `iteration`, `attempt`, `model`, `agent`, `inputTokens`, `outputTokens`, `cacheReadTokens?`, `cacheWriteTokens?`, `reasoningTokens?` |
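
Every event shares `type`, `runId`, and `timestampMs`, so `type` works as a discriminant and TypeScript narrows the extra fields inside each `case`. The sketch below uses a hand-written subset of four event shapes from this page; `EventSubset` and `summarize` are hypothetical, and real code would use the library's exported `SmithersEvent` type instead.

```ts
// Four event shapes copied from the definitions on this page.
type EventSubset =
  | { type: "RunStarted"; runId: string; timestampMs: number }
  | { type: "RunFailed"; runId: string; error: unknown; timestampMs: number }
  | {
      type: "NodeStarted";
      runId: string;
      nodeId: string;
      iteration: number;
      attempt: number;
      timestampMs: number;
    }
  | { type: "FrameCommitted"; runId: string; frameNo: number; xmlHash: string; timestampMs: number };

// One log line per event; each case only sees the fields of its own variant.
function summarize(event: EventSubset): string {
  switch (event.type) {
    case "RunStarted":
      return `[${event.runId}] run started`;
    case "RunFailed":
      return `[${event.runId}] run failed: ${String(event.error)}`;
    case "NodeStarted":
      return `[${event.runId}] ${event.nodeId} started (iteration ${event.iteration}, attempt ${event.attempt})`;
    case "FrameCommitted":
      return `[${event.runId}] frame ${event.frameNo} committed (${event.xmlHash.slice(0, 8)})`;
  }
}
```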

## Persistence

Events are persisted in two places:

1. **SQLite** -- `_smithers_events` table with sequential `seq` number. Source of truth.
2. **NDJSON** -- `stream.ndjson` in the run's log directory. Best-effort.

Both writes are asynchronous. `onProgress` fires synchronously, before the event is persisted.
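
Since the NDJSON log is best-effort, a reader should tolerate blank or truncated lines. The sketch below parses NDJSON text that has already been read from `stream.ndjson`; it assumes each line is one serialized event object with at least `type`, `runId`, and `timestampMs`, and `parseNdjson` and `PersistedEvent` are hypothetical names.

```ts
// Minimal shape every event on this page shares.
type PersistedEvent = { type: string; runId: string; timestampMs: number };

function isPersistedEvent(value: unknown): value is PersistedEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.type === "string" &&
    typeof v.runId === "string" &&
    typeof v.timestampMs === "number"
  );
}

// Parse NDJSON text, skipping blank, malformed, or incomplete lines.
function parseNdjson(text: string): PersistedEvent[] {
  const events: PersistedEvent[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (isPersistedEvent(parsed)) events.push(parsed);
    } catch {
      // A partially written line (for example after a crash) is ignored.
    }
  }
  return events;
}
```

For anything that must be complete, query the SQLite table instead; this reader only suits tailing and ad hoc inspection.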

## Related

- [runWorkflow](/runtime/run-workflow) -- Where `onProgress` is configured.
- [Monitoring and Logs](/guides/monitoring-logs) -- Practical monitoring guide.
- [CLI](/cli/overview) -- View run status and frames.

---

## Revert to Attempt

> Rewind the workspace to a previous task attempt's state using JJ (Jujutsu) snapshots.
> Source: https://smithers.sh/runtime/revert

Smithers records a [JJ (Jujutsu)](https://jj-vcs.github.io/jj/) change ID after each successful task attempt. Revert restores the workspace to the exact filesystem state at that point.

## Prerequisites

- JJ installed and in `PATH` (`brew install jj` or `cargo install jj-cli`)
- Workspace is a JJ repository (`jj git init` or `jj init`)
- Target attempt has a recorded JJ pointer (captured only when JJ was available at attempt completion)

## How It Works

1. On successful attempt completion, Smithers captures the JJ change ID via `jj log -r @ --no-graph --template change_id`.
2. The change ID is stored in `_smithers_attempts.jj_pointer`.
3. `smithers revert` runs `jj restore --from <change_id>` to restore the workspace.
4. Database frames recorded after the attempt started are deleted so the DB state matches the reverted filesystem.

Revert restores files but does not alter JJ history. It creates a new change on top of the current working copy.
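
The two JJ invocations from steps 1 and 3 can be expressed as argument lists, which keeps them easy to test and safe to pass to a process spawner without shell quoting. This is a sketch of the commands named above, not Smithers' internal code; both function names are hypothetical.

```ts
// Step 1: arguments for `jj log -r @ --no-graph --template change_id`.
function captureChangeIdArgs(): string[] {
  return ["log", "-r", "@", "--no-graph", "--template", "change_id"];
}

// Step 3: arguments for `jj restore --from <change_id>`.
function restoreArgs(changeId: string): string[] {
  if (changeId.trim() === "") {
    // Mirrors the "no jjPointer recorded" case: nothing to restore from.
    throw new Error("changeId must be a non-empty JJ change ID");
  }
  return ["restore", "--from", changeId];
}
```

Either list can be handed to Node's `execFile("jj", args)` from `node:child_process`.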

## CLI Usage

```bash
smithers revert <workflow.tsx> \
  --run-id <run-id> \
  --node-id <node-id> \
  [--attempt N] \
  [--iteration N]
```

### Flags

| Flag | Default | Description |
|---|---|---|
| `--run-id ID` | required | Run containing the target attempt. |
| `--node-id ID` | required | Task node to revert to. |
| `--attempt N` | `1` | Attempt number (1-indexed). |
| `--iteration N` | `0` | Loop iteration number. |

### Examples

```bash
# Revert to first attempt of "analyze"
smithers revert workflow.tsx --run-id abc123 --node-id analyze

# Revert to second attempt (after retry)
smithers revert workflow.tsx --run-id abc123 --node-id analyze --attempt 2

# Revert to a specific loop iteration
smithers revert workflow.tsx --run-id abc123 --node-id fix --attempt 1 --iteration 2
```

### Exit Codes

| Code | Meaning |
|---|---|
| `0` | Revert succeeded. |
| `1` | Revert failed. |

### Output

```json
{ "success": true, "jjPointer": "zxkqmrstvwxy" }
```

On failure:

```json
{ "success": false, "error": "Attempt has no jjPointer recorded", "jjPointer": null }
```

`RevertStarted` and `RevertFinished` events are printed as JSON lines to stdout during execution.

## Programmatic Usage

```ts
import { SmithersDb, ensureSmithersTables, revertToAttempt } from "smithers-orchestrator";

// `db` is assumed to be an existing database handle, opened elsewhere.
const adapter = new SmithersDb(db);

const result = await revertToAttempt(adapter, {
  runId: "abc123",
  nodeId: "analyze",
  iteration: 0,
  attempt: 1,
  onProgress: (event) => console.log(event.type),
});

if (result.success) {
  console.log("Reverted to", result.jjPointer);
} else {
  console.error("Revert failed:", result.error);
}
```

### RevertOptions

```ts
type RevertOptions = {
  runId: string;
  nodeId: string;
  iteration: number;
  attempt: number;
  onProgress?: (event: SmithersEvent) => void;
};
```

### RevertResult

```ts
type RevertResult = {
  success: boolean;
  error?: string;
  jjPointer?: string;
};
```

## Events

- **`RevertStarted`** -- Before `jj restore` runs. Includes `jjPointer`.
- **`RevertFinished`** -- After restore completes. Includes `success` and `error` (if failed).

See [Events](/runtime/events#revert-events) for full type definitions.

## Troubleshooting

| Error | Cause |
|---|---|
| "Attempt has no jjPointer recorded" | JJ was not available when the attempt finished. Pointers are captured opportunistically. |
| "jj exited with code 1" | Change ID pruned/GC'd, workspace conflicted, or JJ misconfigured. |

Revert restores the filesystem **and** cleans up database frames recorded after the reverted attempt started. Task outputs, attempt records, and run state remain unchanged. To re-run a task, resume the workflow.

## Related

- [Events](/runtime/events) -- RevertStarted and RevertFinished types.
- [VCS Integration](/guides/vcs) -- Version control integration.
- [CLI](/cli/overview) -- Full CLI reference.

---

## Hello World

> A minimal workflow with a single agent task that generates a greeting.
> Source: https://smithers.sh/examples/hello-world

```tsx
/** @jsxImportSource smithers-orchestrator */
// hello-world.tsx
import { createSmithers, Task, Sequence } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  greeting: z.object({
    message: z.string(),
  }),
});

const greeter = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions: "You are a friendly greeter. Respond with a short, warm greeting.",
});

export default smithers((ctx) => (
  <Workflow name="hello-world">
    <Sequence>
      <Task id="greet" output={outputs.greeting} agent={greeter}>
        Generate a warm greeting for someone named Alice.
      </Task>
    </Sequence>
  </Workflow>
));
```

```bash
bunx smithers-orchestrator up hello-world.tsx --input '{}'
```

```
[hello-world] Starting run abc123
[greet] Running...
[greet] Done -> { message: "Hello Alice! Welcome — it's wonderful to have you here!" }
[hello-world] Completed
```

`createSmithers` registers a `greeting` table with a `message` field. `Task` sends the prompt to the agent and persists structured output. Every task output is stored in the database, so the workflow is resumable -- if it crashes after `greet` completes, re-running skips to the end.

---

## Approval Gate

> A workflow with a human-in-the-loop approval step that pauses execution until a human approves or denies.
> Source: https://smithers.sh/examples/approval-gate

`<Approval>` pauses a workflow at an explicit node, waits for a human decision, then continues.

## Workflow Definition

```tsx
/** @jsxImportSource smithers-orchestrator */
// approval-gate.tsx
import {
  Approval,
  Sequence,
  Task,
  approvalDecisionSchema,
  createSmithers,
} from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  draft: z.object({
    title: z.string(),
    content: z.string(),
  }),
  publishApproval: approvalDecisionSchema,
  published: z.object({
    url: z.string(),
    publishedAt: z.string(),
  }),
});

const writer = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions:
    "You are a technical writer. Draft a blog post with a title and full content based on the given topic.",
});

const publisher = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions:
    "You are a publishing agent. Take the approved draft and return a URL and timestamp for the published post.",
});

export default smithers((ctx) => {
  const draft = ctx.outputMaybe(outputs.draft, { nodeId: "write-draft" });
  const decision = ctx.outputMaybe(outputs.publishApproval, {
    nodeId: "approve-publish",
  });

  return (
    <Workflow name="approval-gate">
      <Sequence>
        <Task id="write-draft" output={outputs.draft} agent={writer}>
          Write a blog post about deterministic AI workflows and why resumability
          matters for production systems.
        </Task>

        <Approval
          id="approve-publish"
          output={outputs.publishApproval}
          request={{
            title: "Publish blog post",
            summary: draft
              ? `Publish "${draft.title}" to the public site.`
              : "Publish the current draft.",
          }}
        />

        {decision?.approved ? (
          <Task id="publish" output={outputs.published} agent={publisher}>
            Publish this approved draft:{"\n\n"}
            Title: {draft?.title}
            {"\n\n"}
            {draft?.content}
          </Task>
        ) : null}
      </Sequence>
    </Workflow>
  );
});
```

## Running

```bash
bunx smithers-orchestrator up approval-gate.tsx --input '{}'
```

```
[approval-gate] Starting run mno345
[write-draft] Done -> { title: "Why Resumability Matters", content: "In production AI systems..." }
[approve-publish] Waiting for approval...
[approval-gate] Paused — run `bunx smithers-orchestrator approve` or `bunx smithers-orchestrator deny` to continue.
```

## Approving or Denying

Approve and resume:

```bash
bunx smithers-orchestrator approve mno345 --node approve-publish
bunx smithers-orchestrator up approval-gate.tsx --run-id mno345 --resume true
```

```
[approve-publish] Approved.
[publish] Running...
[publish] Done -> { url: "https://blog.example.com/resumability", publishedAt: "2026-02-10T12:00:00Z" }
[approval-gate] Completed
```

Deny and halt:

```bash
bunx smithers-orchestrator deny mno345 --node approve-publish
```

```
[approve-publish] Denied.
[approval-gate] Halted at node "approve-publish" (denied by user).
```

## Listing Pending Approvals

```bash
bunx smithers-orchestrator ps --status waiting-approval
```

```json
{
  "runs": [
    {
      "id": "mno345",
      "workflow": "approval-gate",
      "status": "waiting-approval",
      "step": "approve-publish",
      "started": "2m ago"
    }
  ]
}
```

## How It Works

- `<Approval>` persists a decision object (`approved`, `note`, `decidedBy`, `decidedAt`) when the workflow resumes. The audit timestamp itself lives in Smithers' approval records and event log, so `decidedAt` remains deterministic in durable outputs.
- Re-running after approval replays completed tasks from the database and continues from the approval point.
- Denial is permanent for that run. To retry, start a new run.
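
The gating behavior above can be pictured with a small standalone model. The field names on `ApprovalDecision` come from the list above; the field types, their optionality, and the `nextNodes` helper are assumptions made for illustration and are not Smithers' actual types.

```typescript
// Hypothetical model of the persisted decision object.
// Field names are taken from the docs; the types are assumed.
interface ApprovalDecision {
  approved: boolean;
  note?: string;
  decidedBy?: string;
  decidedAt?: string;
}

// Mirrors `decision?.approved ? <Task id="publish" ... /> : null` in the workflow:
// the publish task is rendered only once an approving decision exists.
function nextNodes(decision: ApprovalDecision | undefined): string[] {
  return decision?.approved ? ["publish"] : [];
}

const pending = nextNodes(undefined); // no decision yet
const denied = nextNodes({ approved: false, note: "Needs legal review" });
const approved = nextNodes({ approved: true, decidedBy: "editor" });
```

In this model a missing decision and a denial both leave the publish task unrendered, which matches the conditional in the workflow definition.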

---

## Tools Agent

> An agent that uses built-in tools (read, grep, bash) to search a codebase and return structured results.
> Source: https://smithers.sh/examples/tools-agent

# Tools Agent

An agent with filesystem and shell tools for codebase analysis, log searching, or automated refactoring.

## Workflow Definition

```tsx
/** @jsxImportSource smithers-orchestrator */
// tools-agent.tsx
import { createSmithers, Task, Sequence, tools } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  searchResult: z.object({
    matches: z.array(
      z.object({
        file: z.string(),
        line: z.number(),
        content: z.string(),
      })
    ),
    summary: z.string(),
    recommendation: z.string(),
  }),
});

const codeSearchAgent = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions: `You are a codebase analysis agent. Use the provided tools to search
through source code and answer questions. Always back up your findings with
specific file paths and line numbers.`,
  tools,
});

export default smithers((ctx) => (
  <Workflow name="tools-agent">
    <Sequence>
      <Task
        id="search"
        output={outputs.searchResult}
        agent={codeSearchAgent}
        timeoutMs={60_000}
        retries={2}
      >
        Search the current repository for all usages of deprecated API calls
        matching the pattern "legacyAuth". For each match, record the file path,
        line number, and the matching line content. Then provide a summary of how
        widespread the usage is and a recommendation for migration.
      </Task>
    </Sequence>
  </Workflow>
));
```

## Running

```bash
bunx smithers-orchestrator up tools-agent.tsx --input '{}'
```

```
[tools-agent] Starting run pqr678
[search] Running...
  [tool:grep] pattern="legacyAuth" path="." -> 4 matches
  [tool:read] file="src/auth/login.ts" lines=42-50
  [tool:read] file="src/middleware/session.ts" lines=18-25
[search] Done -> {
  matches: [
    { file: "src/auth/login.ts", line: 45, content: "const session = legacyAuth.createSession(user);" },
    { file: "src/auth/login.ts", line: 48, content: "legacyAuth.setToken(session.token);" },
    { file: "src/middleware/session.ts", line: 20, content: "if (legacyAuth.verify(token)) {" },
    { file: "src/middleware/session.ts", line: 23, content: "legacyAuth.refresh(token);" }
  ],
  summary: "4 usages across 2 files (auth/login.ts and middleware/session.ts).",
  recommendation: "Replace legacyAuth with the new AuthService class. Start with session.ts since it has fewer call sites."
}
[tools-agent] Completed
```

## Built-in Tools

```tsx
/** @jsxImportSource smithers-orchestrator */
import { tools } from "smithers-orchestrator";

// Or individually:
import { read, write, edit, grep, bash } from "smithers-orchestrator";
```

| Tool | Description |
|------|-------------|
| `read` | Read file contents by path. Supports line ranges. |
| `write` | Write content to a file. Creates if absent. |
| `edit` | Search-and-replace edits. Safer than full rewrites. |
| `grep` | Regex search over file contents. Returns files, lines, context. |
| `bash` | Execute shell commands. |

## Robustness Props

| Prop | Value | Purpose |
|------|-------|---------|
| `timeoutMs` | `60_000` | Kill runaway tool loops after 60s. |
| `retries` | `2` | Retry on failure (e.g., grep typo on first pass). |

```tsx
/** @jsxImportSource smithers-orchestrator */
<Task
  id="search"
  output={outputs.searchResult}
  agent={codeSearchAgent}
  timeoutMs={60_000}
  retries={2}
  continueOnFail  // workflow continues even if retries exhausted
  meta={{ pattern: "legacyAuth" }}  // arbitrary metadata stored with the task
>
  Search the current repository for usages of "legacyAuth".
</Task>
```
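
To make the retry behavior concrete, here is a minimal synchronous sketch. It assumes `retries={2}` means up to two additional attempts after the first one, which this page does not state explicitly. `runWithRetries` and `Outcome` are hypothetical names, and the timeout is not modeled.

```typescript
// Hypothetical result type: either the task's value or a failure marker.
type Outcome<T> =
  | { status: "done"; value: T; attempts: number }
  | { status: "failed"; attempts: number };

// Assumption: `retries` counts extra attempts after the first one.
function runWithRetries<T>(task: () => T, retries: number): Outcome<T> {
  for (let attempts = 1; attempts <= retries + 1; attempts++) {
    try {
      return { status: "done", value: task(), attempts };
    } catch {
      // Swallow the error and fall through to the next attempt, if any remain.
    }
  }
  return { status: "failed", attempts: retries + 1 };
}

// A task that fails twice, then succeeds on the third call.
let calls = 0;
const flaky = (): string => {
  calls++;
  if (calls < 3) throw new Error("transient failure");
  return "ok";
};

const flakyResult = runWithRetries(flaky, 2);
const alwaysFails = runWithRetries((): string => {
  throw new Error("boom");
}, 2);
```

Under this model, a `failed` outcome is the point where `continueOnFail` would matter: the workflow would move on instead of stopping.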

---

## Multi-Agent Review

> Two reviewer agents run in parallel, then a third task aggregates their results.
> Source: https://smithers.sh/examples/multi-agent-review

# Multi-Agent Review

Two reviewers run concurrently via `<Parallel>`, then an aggregator produces a final verdict.

## Workflow Definition

```tsx
/** @jsxImportSource smithers-orchestrator */
// multi-agent-review.tsx
import { createSmithers, Task, Sequence, Parallel } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  review: z.object({
    approved: z.boolean(),
    feedback: z.string(),
  }),
  verdict: z.object({
    approved: z.boolean(),
    summary: z.string(),
  }),
});

const securityReviewer = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions:
    "You are a security-focused code reviewer. Look for vulnerabilities, injection risks, and auth issues. Return your verdict and detailed feedback.",
});

const qualityReviewer = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions:
    "You are a code quality reviewer. Evaluate readability, test coverage, error handling, and adherence to best practices. Return your verdict and detailed feedback.",
});

const aggregator = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions:
    "You receive two code reviews. Synthesize them into a single verdict. Approve only if both reviewers approve.",
});

export default smithers((ctx) => {
  const secReview = ctx.outputMaybe("review", { nodeId: "security-review" });
  const qualReview = ctx.outputMaybe("review", { nodeId: "quality-review" });

  return (
    <Workflow name="multi-agent-review">
      <Sequence>
        <Parallel maxConcurrency={2}>
          <Task id="security-review" output={outputs.review} agent={securityReviewer}>
            Review this PR diff for security issues:{"\n\n"}
            ```diff{"\n"}- const token = req.query.token;{"\n"}+ const token =
            sanitize(req.headers.authorization);{"\n"}```
          </Task>

          <Task id="quality-review" output={outputs.review} agent={qualityReviewer}>
            Review this PR diff for code quality:{"\n\n"}
            ```diff{"\n"}- const token = req.query.token;{"\n"}+ const token =
            sanitize(req.headers.authorization);{"\n"}```
          </Task>
        </Parallel>

        <Task id="aggregate" output={outputs.verdict} agent={aggregator}>
          Combine these two reviews into a final verdict:{"\n\n"}
          Security review: {secReview?.approved ? "APPROVED" : "REJECTED"} -{" "}
          {secReview?.feedback}
          {"\n\n"}
          Quality review: {qualReview?.approved ? "APPROVED" : "REJECTED"} -{" "}
          {qualReview?.feedback}
        </Task>
      </Sequence>
    </Workflow>
  );
});
```

## Running

```bash
bunx smithers-orchestrator up multi-agent-review.tsx --input '{}'
```

```
[multi-agent-review] Starting run jkl012
[security-review] Running...
[quality-review] Running...
[security-review] Done -> { approved: true, feedback: "Good: moved token from query to header, added sanitization." }
[quality-review] Done -> { approved: false, feedback: "Missing error handling if authorization header is absent." }
[aggregate] Done -> { approved: false, summary: "Security looks good, but quality reviewer flagged missing null check on header." }
[multi-agent-review] Completed
```

## How Parallel Works

- Children of `<Parallel>` are started together, up to the `maxConcurrency` limit.
- `maxConcurrency` caps how many tasks run simultaneously. If omitted, all children run at once.
- The `Sequence` waits for all parallel tasks to finish before continuing.
- Tasks sharing the same output table are disambiguated by `nodeId`.

Retrieve each result with `ctx.outputMaybe(schemaKey, { nodeId })`:

```tsx
/** @jsxImportSource smithers-orchestrator */
const secReview = ctx.outputMaybe("review", { nodeId: "security-review" });
const qualReview = ctx.outputMaybe("review", { nodeId: "quality-review" });
```

Both return `undefined` until their respective tasks complete.
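
The `nodeId` disambiguation can be sketched with an in-memory stand-in for the shared output table. `OutputStore` is a hypothetical class written for this illustration; only the `outputMaybe(schemaKey, { nodeId })` call shape is taken from the docs.

```typescript
interface Review {
  approved: boolean;
  feedback: string;
}

// Hypothetical in-memory stand-in for the persisted output table.
class OutputStore {
  private rows = new Map<string, unknown>();

  private key(schemaKey: string, nodeId: string): string {
    return `${schemaKey}/${nodeId}`;
  }

  write(schemaKey: string, nodeId: string, value: unknown): void {
    this.rows.set(this.key(schemaKey, nodeId), value);
  }

  // Returns undefined until the task identified by nodeId has written its output.
  outputMaybe<T>(schemaKey: string, opts: { nodeId: string }): T | undefined {
    return this.rows.get(this.key(schemaKey, opts.nodeId)) as T | undefined;
  }
}

const store = new OutputStore();
const before = store.outputMaybe<Review>("review", { nodeId: "security-review" });

// Two tasks write to the same schema key under different node ids.
store.write("review", "security-review", { approved: true, feedback: "Token moved to header." });
store.write("review", "quality-review", { approved: false, feedback: "Missing error handling." });

const sec = store.outputMaybe<Review>("review", { nodeId: "security-review" });
const qual = store.outputMaybe<Review>("review", { nodeId: "quality-review" });

// Aggregation rule from the aggregator's instructions: approve only if both approve.
const finalApproved = Boolean(sec?.approved && qual?.approved);
```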

---

## Dynamic Plan

> A workflow that analyzes input and branches between a simple and a complex execution path based on the analysis result.
> Source: https://smithers.sh/examples/dynamic-plan

# Dynamic Plan

`<Branch>` chooses between execution paths at runtime. Here, an analyzer classifies task complexity and routes to either a quick fix or a multi-step plan.

## Workflow Definition

```tsx
/** @jsxImportSource smithers-orchestrator */
// dynamic-plan.tsx
import { createSmithers, Task, Sequence, Branch } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  analysis: z.object({
    summary: z.string(),
    complexity: z.enum(["low", "high"]),
  }),
  plan: z.object({
    steps: z.array(z.string()),
  }),
  result: z.object({
    output: z.string(),
  }),
});

const analyzer = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions:
    "Analyze the given task. Determine if it is low or high complexity. Return a short summary and a complexity rating.",
});

const planner = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions: "Break the task into concrete, ordered steps.",
});

const implementer = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions: "Implement the requested task and return the result.",
});

export default smithers((ctx) => {
  const analysis = ctx.outputMaybe("analysis", { nodeId: "analyze" });
  const isComplex = analysis?.complexity === "high";

  return (
    <Workflow name="dynamic-plan">
      <Sequence>
        <Task id="analyze" output={outputs.analysis} agent={analyzer}>
          Analyze this task and classify its complexity: "Refactor the authentication
          module to support OAuth2 and SAML providers."
        </Task>

        <Branch
          if={isComplex}
          then={
            <Sequence>
              <Task id="plan" output={outputs.plan} agent={planner}>
                Create a step-by-step plan for: {analysis?.summary}
              </Task>
              <Task id="implement" output={outputs.result} agent={implementer}>
                Execute these steps:{" "}
                {ctx
                  .outputMaybe("plan", { nodeId: "plan" })
                  ?.steps.join(", ")}
              </Task>
            </Sequence>
          }
          else={
            <Task id="implement" output={outputs.result} agent={implementer}>
              Quick implementation for: {analysis?.summary}
            </Task>
          }
        />
      </Sequence>
    </Workflow>
  );
});
```

## Running

```bash
bunx smithers-orchestrator up dynamic-plan.tsx --input '{}'
```

High complexity path:

```
[dynamic-plan] Starting run def456
[analyze] Done -> { summary: "Refactor auth to support OAuth2 + SAML", complexity: "high" }
[plan] Done -> { steps: ["Abstract provider interface", "Implement OAuth2", "Implement SAML", "Add tests"] }
[implement] Done -> { output: "Refactored auth module with provider abstraction..." }
[dynamic-plan] Completed
```

Low complexity path (planner skipped):

```
[analyze] Done -> { summary: "Minor auth tweak", complexity: "low" }
[implement] Done -> { output: "Applied quick fix..." }
```

## How Branch Works

- `if` is evaluated each time the workflow re-renders. Only the matching branch (`then` or `else`) is mounted.
- The Branch is inside a `Sequence`, so it is not reached until `analyze` finishes and `analysis` is populated.
- Resumable: completed task outputs are persisted, so re-running picks up where it left off.
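
The routing described above can be modeled as a pure function of the analysis output. This sketch covers only the decision logic. `reachableTasks` is a hypothetical helper, not part of Smithers; the task ids are the ones used in the workflow definition.

```typescript
type Complexity = "low" | "high";

interface Analysis {
  summary: string;
  complexity: Complexity;
}

// Returns the task ids that can execute for a given analysis result,
// mirroring the Sequence + Branch structure of the workflow.
function reachableTasks(analysis: Analysis | undefined): string[] {
  // The Branch sits after `analyze` in the Sequence, so nothing else runs first.
  if (!analysis) return ["analyze"];
  return analysis.complexity === "high"
    ? ["analyze", "plan", "implement"] // `then` path
    : ["analyze", "implement"]; // `else` path, planner skipped
}

const firstRender = reachableTasks(undefined);
const highPath = reachableTasks({ summary: "Refactor auth", complexity: "high" });
const lowPath = reachableTasks({ summary: "Minor auth tweak", complexity: "low" });
```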

---

## Loop

> An iterative review loop where an agent writes code and a reviewer evaluates it until the code is approved or the maximum iteration count is reached.
> Source: https://smithers.sh/examples/loop

# Loop

`<Loop>` re-executes its children until a condition is met or a maximum iteration count is reached.

## Workflow Definition

```tsx
/** @jsxImportSource smithers-orchestrator */
// review-loop.tsx
import { createSmithers, Task, Sequence, Loop } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { Workflow, smithers, outputs } = createSmithers({
  code: z.object({
    source: z.string(),
    language: z.string(),
  }),
  review: z.object({
    approved: z.boolean(),
    feedback: z.string(),
  }),
  finalOutput: z.object({
    source: z.string(),
    iterations: z.number(),
  }),
});

const coder = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions:
    "You are an expert programmer. Write or revise code based on the given requirements and feedback.",
});

const reviewer = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions:
    "You are a strict code reviewer. Evaluate the code for correctness, style, and edge cases. Set approved to true only if the code is production-ready.",
});

export default smithers((ctx) => {
  const latestReview = ctx.outputMaybe("review", { nodeId: "review" });
  const latestCode = ctx.outputMaybe("code", { nodeId: "write" });

  return (
    <Workflow name="review-loop">
      <Sequence>
        <Loop
          id="revision-loop"
          until={latestReview?.approved === true}
          maxIterations={5}
          onMaxReached="return-last"
        >
          <Sequence>
            <Task id="write" output={outputs.code} agent={coder}>
              Write a TypeScript function that debounces an input function.
              {latestReview
                ? ` Revise based on this feedback: ${latestReview.feedback}`
                : ""}
            </Task>

            <Task id="review" output={outputs.review} agent={reviewer}>
              Review this code for correctness and edge cases:
              {"\n\n```" + (latestCode?.language ?? "ts") + "\n"}
              {latestCode?.source ?? "// no code yet"}
              {"\n```"}
            </Task>
          </Sequence>
        </Loop>

        <Task id="final" output={outputs.finalOutput}>
          {{
            source: latestCode?.source ?? "",
            iterations: ctx.iterationCount("code", "write"),
          }}
        </Task>
      </Sequence>
    </Workflow>
  );
});
```

## Running

```bash
bunx smithers-orchestrator up review-loop.tsx --input '{}'
```

```
[review-loop] Starting run ghi789
[revision-loop] Iteration 1
  [write] Done -> { source: "function debounce(fn, ms) { ... }", language: "ts" }
  [review] Done -> { approved: false, feedback: "Missing generic types; no cancel method." }
[revision-loop] Iteration 2
  [write] Done -> { source: "function debounce<T>(fn: T, ms: number) { ... cancel() ... }", language: "ts" }
  [review] Done -> { approved: true, feedback: "Looks good. Generics and cancel are correct." }
[final] Done -> { source: "function debounce<T>(...) { ... }", iterations: 2 }
[review-loop] Completed
```

## Loop Props

| Prop | Description |
|------|-------------|
| `id` | Unique identifier for the loop node. |
| `until` | Boolean expression. When `true`, the loop stops. |
| `maxIterations` | Safety cap on iterations (default: 5). |
| `onMaxReached` | `"fail"` throws an error; `"return-last"` exits with the last output. |

## Context Methods

- `ctx.outputMaybe(schemaKey, { nodeId })` returns the latest value from the most recent iteration. The first argument is the schema key from `createSmithers`, not a table name.
- `ctx.iterationCount(schemaKey, nodeId)` returns how many times a task has executed.
- `ctx.latest(schemaKey, nodeId)` always returns the highest-iteration row. Inside loops, this is often more convenient than `ctx.outputMaybe`.

All intermediate outputs are persisted. If the workflow crashes mid-iteration, resuming the run continues from the task that did not finish; completed tasks are not re-executed.
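
As a rough mental model, each task execution inside a loop can be thought of as one row tagged with an iteration number. The `Row` shape, the zero-based iteration numbering, and the standalone `latest` and `iterationCount` functions below are assumptions for illustration; the real methods live on `ctx` and read from the database.

```typescript
// Hypothetical row shape: one row per task execution.
interface Row<T> {
  nodeId: string;
  iteration: number; // assumed to be zero-based
  value: T;
}

// Returns the value from the highest-iteration row for a node, if any.
function latest<T>(rows: Row<T>[], nodeId: string): T | undefined {
  let best: Row<T> | undefined;
  for (const row of rows) {
    if (row.nodeId === nodeId && (!best || row.iteration > best.iteration)) {
      best = row;
    }
  }
  return best?.value;
}

// Counts how many times a node has produced output.
function iterationCount<T>(rows: Row<T>[], nodeId: string): number {
  return rows.filter((row) => row.nodeId === nodeId).length;
}

const codeRows: Row<{ source: string }>[] = [
  { nodeId: "write", iteration: 0, value: { source: "function debounce(fn, ms) {}" } },
  { nodeId: "write", iteration: 1, value: { source: "function debounce<T>(fn: T, ms: number) {}" } },
];

const newest = latest(codeRows, "write");
const writeCount = iterationCount(codeRows, "write");
const missing = latest(codeRows, "review");
```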

## Re-render Cycle

1. The builder function `(ctx) => (...)` runs on every render frame.
2. First render: `ctx.outputMaybe("review", ...)` returns `undefined`. The write task produces an initial draft.
3. After both tasks complete, the renderer persists outputs and re-renders.
4. Next render: `latestReview` is populated. The loop evaluates `until`. If not approved, the body executes again with the review feedback.
5. Repeats until approved or `maxIterations` is reached.
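
The cycle above can be simulated in plain TypeScript. This is a simplified sketch, not the Smithers renderer: `runLoop` is a hypothetical function, the reviewer is a stub passed in as a callback, and each pass of the `while` loop stands in for one render.

```typescript
interface Review {
  approved: boolean;
  feedback: string;
}

type OnMaxReached = "fail" | "return-last";

function runLoop(
  reviewOf: (iteration: number) => Review, // stub for the write + review tasks
  maxIterations: number,
  onMaxReached: OnMaxReached,
): { iterations: number; lastReview: Review | undefined } {
  let lastReview: Review | undefined;
  let iterations = 0;

  // Each pass models one render: evaluate `until`, then run the loop body.
  while (lastReview?.approved !== true) {
    if (iterations >= maxIterations) {
      if (onMaxReached === "fail") {
        throw new Error(`max iterations (${maxIterations}) reached`);
      }
      break; // "return-last": exit with whatever the last iteration produced
    }
    iterations++;
    lastReview = reviewOf(iterations); // outputs are "persisted", then we re-render
  }

  return { iterations, lastReview };
}

// Reviewer stub that approves on the second iteration.
const approvedOnSecond = runLoop(
  (i) => ({ approved: i >= 2, feedback: i >= 2 ? "Looks good." : "Missing generic types." }),
  5,
  "return-last",
);

// Reviewer stub that never approves: the loop stops at the cap.
const neverApproved = runLoop(() => ({ approved: false, feedback: "Still failing." }), 3, "return-last");
```

With `"return-last"` the loop exits quietly at the cap; with `"fail"` the same situation raises an error instead.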

---

## Ghost: workflows/hello.tsx

> Example from workflows/hello.tsx — A minimal hello-world workflow using literal output (no agent) with deterministic persistence.
> Source: https://smithers.sh/examples/workflow-hello

# workflows/hello.tsx

> **Note:** **Ghost doc** -- Real script from `workflows/hello.tsx`. Demonstrates literal output with no AI agent.


## Source

```tsx
/** @jsxImportSource smithers-orchestrator */
// workflows/hello.tsx
import { createSmithers, Workflow, Task } from "smithers-orchestrator";
import { z } from "zod";

const { smithers, outputs } = createSmithers({
  output: z.object({
    message: z.string(),
    length: z.number(),
  }),
});

export default smithers((ctx) => (
  <Workflow name="hello">
    <Task id="hello" output={outputs.output}>
      {{
        message: `Hello, ${ctx.input.name}!`,
        length: ctx.input.name.length,
      }}
    </Task>
  </Workflow>
));
```

## Running

```bash
bunx smithers-orchestrator up workflows/hello.tsx --input '{"name": "World"}'
```

```
[hello] Starting run abc123
[hello] Done -> { message: "Hello, World!", length: 5 }
[hello] Completed
```

## Notes

- **Literal output** -- `Task` receives a plain object instead of an agent prompt. No LLM call; deterministic output.
- **`ctx.input`** -- Access the CLI input payload passed via `--input`.
- **Resumable** -- Output is persisted to SQLite. Re-running after a crash skips completed tasks.
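
The skip-on-resume behavior can be illustrated with a small model in which a `Map` stands in for the SQLite table. `runTask`, `persisted`, and `executions` are hypothetical names; the real persistence layer is not shown on this page.

```typescript
// Hypothetical stand-in for the persisted output table: nodeId -> output.
const persisted = new Map<string, unknown>();
let executions = 0;

function runTask<T>(nodeId: string, compute: () => T): T {
  // A task that already has persisted output is skipped.
  if (persisted.has(nodeId)) return persisted.get(nodeId) as T;
  executions++;
  const value = compute();
  persisted.set(nodeId, value);
  return value;
}

const input = { name: "World" };
const hello = () => ({
  message: `Hello, ${input.name}!`,
  length: input.name.length,
});

const firstRun = runTask("hello", hello);
const rerun = runTask("hello", hello); // simulates re-running after a crash
```

Because the output is a literal computed from `ctx.input`, both runs yield the same value, and the second call never invokes `compute`.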

---

## Ghost: workflows/approval.tsx

> Example from workflows/approval.tsx — A two-step sequential workflow with a human approval gate before the final task.
> Source: https://smithers.sh/examples/workflow-approval

# workflows/approval.tsx

> **Note:** **Ghost doc** -- Real script from `workflows/approval.tsx`. Demonstrates `needsApproval` for human-in-the-loop workflows.


## Source

```tsx
/** @jsxImportSource smithers-orchestrator */
// workflows/approval.tsx
import { createSmithers } from "smithers-orchestrator";
import { z } from "zod";

const { Workflow, Sequence, Task, smithers, outputs } = createSmithers({
  input: z.object({
    name: z.string(),
  }),
  output: z.object({
    message: z.string(),
    length: z.number(),
  }),
});

export default smithers((ctx) => (
  <Workflow name="approval">
    <Sequence>
      <Task id="approve" output={outputs.output} needsApproval>
        {{
          message: `Approved: ${ctx.input.name}`,
          length: ctx.input.name.length,
        }}
      </Task>
      <Task id="final" output={outputs.output}>
        {{
          message: `Done: ${ctx.input.name}`,
          length: ctx.input.name.length,
        }}
      </Task>
    </Sequence>
  </Workflow>
));
```

## Running

```bash
bunx smithers-orchestrator up workflows/approval.tsx --input '{"name": "Deploy v2"}'
```

The workflow pauses at the approval gate:

```
[approval] Starting run mno345
[approve] Waiting for approval...
[approval] Paused — run `bunx smithers-orchestrator approve` or `bunx smithers-orchestrator deny` to continue.
```

Approve and resume:

```bash
bunx smithers-orchestrator approve mno345 --node approve
bunx smithers-orchestrator up workflows/approval.tsx --run-id mno345 --resume true
```

```
[approve] Approved. Running...
[approve] Done -> { message: "Approved: Deploy v2", length: 9 }
[final] Done -> { message: "Done: Deploy v2", length: 9 }
[approval] Completed
```

## Notes

- **`needsApproval`** -- Pauses execution until a human approves. Core human-in-the-loop primitive.
- **Resumable** -- Workflow state is persisted to SQLite. Approval is recorded durably; `--resume true` continues from the gate.

---

## Workflow Quickstart

> A two-agent sequential smithers-orchestrator workflow where a planner task feeds a briefing task through persisted structured output.
> Source: https://smithers.sh/examples/workflow-quickstart

# Workflow Quickstart

> **Note:** Standalone `smithers-orchestrator` quickstart example. A planner task feeds a briefing task through persisted workflow output.


## Source

```tsx
/** @jsxImportSource smithers-orchestrator */
// workflows/quickstart.tsx
import { createSmithers } from "smithers-orchestrator";
import { ToolLoopAgent as Agent, Output } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const planSchema = z.object({
  summary: z.string(),
  steps: z.array(z.string()).min(3).max(8),
});

const briefSchema = z.object({
  brief: z.string(),
  stepCount: z.number().int().min(1),
});

const { Workflow, Sequence, Task, smithers, outputs } = createSmithers({
  plan: planSchema,
  brief: briefSchema,
});

const planAgent = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  output: Output.object({ schema: planSchema }),
  instructions:
    "You are a planning assistant. Return a concise summary and 3-8 actionable steps.",
});

const briefAgent = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  output: Output.object({ schema: briefSchema }),
  instructions:
    "You are a concise technical writer. Produce a 5-8 sentence brief.",
});

export default smithers((ctx) => {
  const planOutput = ctx.outputMaybe(outputs.plan, { nodeId: "plan" });

  return (
    <Workflow name="quickstart">
      <Sequence>
        <Task id="plan" output={outputs.plan} agent={planAgent}>
          {`Create a short plan for this goal:\n${ctx.input.goal}`}
        </Task>
        <Task id="brief" output={outputs.brief} agent={briefAgent}>
          {`Goal: ${ctx.input.goal}
Plan summary: ${planOutput?.summary ?? "pending"}
Steps: ${JSON.stringify(planOutput?.steps ?? [])}

Write a brief based on the plan. The "stepCount" must equal the number of steps.`}
        </Task>
      </Sequence>
    </Workflow>
  );
});
```

## Running

```bash
bunx smithers-orchestrator up workflows/quickstart.tsx --input '{"goal": "Build a CLI tool for managing dotfiles"}'
```

```
[quickstart] Starting run def456
[plan] Done -> { summary: "Build a dotfile manager CLI", steps: ["Parse config", "Symlink files", "Add backup"] }
[brief] Done -> { brief: "This plan covers 3 steps...", stepCount: 3 }
[quickstart] Completed
```

## Notes

- **Cross-task data flow** -- `ctx.outputMaybe(outputs.plan, { nodeId: "plan" })` reads the planner's persisted output. `ctx.outputMaybe` accepts either a schema key string like `"plan"` or the schema object from `outputs`.
- **Shared schemas** -- The same Zod schemas are reused by `createSmithers(...)` and `Output.object({ schema })`, so task persistence and agent output stay aligned.
- **Vercel AI SDK** -- `ToolLoopAgent` and `Output` are exported from `ai`, and the agent constructor accepts `{ model, output, instructions }`.
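
The cross-task data flow relies on the fallback values in the brief prompt, since `planOutput` is `undefined` until the planner has finished. The sketch below reproduces that template as a standalone function. `briefPrompt` is a hypothetical helper, and the `Plan` interface restates the fields of `planSchema` by hand.

```typescript
interface Plan {
  summary: string;
  steps: string[];
}

// Builds the first lines of the brief prompt, using the same fallbacks as the workflow.
function briefPrompt(goal: string, plan: Plan | undefined): string {
  return [
    `Goal: ${goal}`,
    `Plan summary: ${plan?.summary ?? "pending"}`,
    `Steps: ${JSON.stringify(plan?.steps ?? [])}`,
  ].join("\n");
}

const beforePlan = briefPrompt("Build a CLI tool", undefined);
const afterPlan = briefPrompt("Build a CLI tool", {
  summary: "Build a dotfile manager CLI",
  steps: ["Parse config", "Symlink files", "Add backup"],
});
```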

---

## Ghost: Worktree Feature Workflow

> Example from scripts/worktree-feature/ — A production multi-agent pipeline that discovers tickets from a PRD, implements them with Claude/Codex, validates, reviews in parallel, and generates reports.
> Source: https://smithers.sh/examples/worktree-feature-workflow

# scripts/worktree-feature/ — Full Pipeline

> **Note:** **Ghost doc** -- Real production workflow at `scripts/worktree-feature/`. This is the most complex Smithers example: it drives multiple CLI agents (Claude Code, OpenAI Codex) through a full development lifecycle.


## Pipeline

1. **Discover** — Read PRD, break into ordered independent tickets
2. **Implement** — Write code end-to-end per ticket
3. **Validate** — Run `bun test`
4. **Review** — Claude + Codex review in parallel
5. **ReviewFix** — Address review issues
6. **Report** — Generate final report

Steps 2-5 repeat via `<Loop>` until both reviewers approve or the maximum number of iterations is reached.

## Schema Setup — smithers.ts

```tsx
// scripts/worktree-feature/smithers.ts
import { createSmithers } from "smithers-orchestrator";
import { z } from "zod";

// Each pipeline stage gets its own Zod output schema
const DiscoverOutput = z.object({
  tickets: z.array(z.object({
    id: z.string(),
    title: z.string(),
    description: z.string(),
    acceptanceCriteria: z.array(z.string()),
    filesToModify: z.array(z.string()),
    filesToCreate: z.array(z.string()),
    dependencies: z.array(z.string()).nullable(),
  })),
  reasoning: z.string(),
});

const ImplementOutput = z.object({
  filesCreated: z.array(z.string()).nullable(),
  filesModified: z.array(z.string()).nullable(),
  whatWasDone: z.string(),
  allTestsPassing: z.boolean(),
  testOutput: z.string(),
});

const ValidateOutput = z.object({
  allPassed: z.boolean(),
  failingSummary: z.string().nullable(),
});

const ReviewOutput = z.object({
  reviewer: z.string(),
  approved: z.boolean(),
  issues: z.array(z.object({
    severity: z.enum(["critical", "major", "minor", "nit"]),
    file: z.string(),
    line: z.number().nullable(),
    description: z.string(),
    suggestion: z.string().nullable(),
  })),
  feedback: z.string(),
});

const ReviewFixOutput = z.object({
  fixesMade: z.array(z.object({ issue: z.string(), fix: z.string(), file: z.string() })),
  allIssuesResolved: z.boolean(),
  summary: z.string(),
});

const ReportOutput = z.object({
  ticketTitle: z.string(),
  status: z.enum(["completed", "partial", "failed"]),
  summary: z.string(),
  filesChanged: z.number(),
  reviewRounds: z.number(),
});

export const { Workflow, Task, useCtx, smithers, tables, outputs } = createSmithers({
  discover: DiscoverOutput,
  implement: ImplementOutput,
  validate: ValidateOutput,
  review: ReviewOutput,
  reviewFix: ReviewFixOutput,
  report: ReportOutput,
}, {
  dbPath: `${process.env.HOME}/.cache/smithers/worktree-feature.db`,
  journalMode: "DELETE",
});
```

## Entry Point — workflow.tsx

```tsx
// scripts/worktree-feature/workflow.tsx
import { Sequence, Branch } from "smithers-orchestrator";
import { Discover, TicketPipeline } from "./components";
import { Workflow, smithers, outputs } from "./smithers";

export default smithers((ctx) => {
  const discoverOutput = ctx.latest("discover", "discover-codex");
  const tickets = discoverOutput?.tickets ?? [];
  const unfinishedTickets = tickets.filter(
    (t: any) => !ctx.latest("report", `${t.id}:report`)
  );

  return (
    <Workflow name="worktree-feature">
      <Sequence>
        <Branch if={tickets.length === 0} then={<Discover />} />
        {unfinishedTickets.map((ticket: any) => (
          <TicketPipeline key={ticket.id} ticket={ticket} />
        ))}
      </Sequence>
    </Workflow>
  );
});
```

## Agents — agents.ts

```tsx
// scripts/worktree-feature/agents.ts
import { ToolLoopAgent as Agent, stepCountIs } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { openai } from "@ai-sdk/openai";
import { ClaudeCodeAgent, CodexAgent } from "smithers-orchestrator";
import { SYSTEM_PROMPT } from "./system-prompt";

const USE_CLI = process.env.USE_CLI_AGENTS !== "0";
const UNSAFE = process.env.SMITHERS_UNSAFE === "1";

// Claude — switches between API agent and CLI agent
const claudeApi = new Agent({
  model: anthropic("claude-opus-4-6"),
  instructions: SYSTEM_PROMPT,
  stopWhen: stepCountIs(100),
});

const claudeCli = new ClaudeCodeAgent({
  model: "claude-opus-4-6",
  systemPrompt: SYSTEM_PROMPT,
  dangerouslySkipPermissions: UNSAFE,
  timeoutMs: 30 * 60 * 1000,
});

export const claude = USE_CLI ? claudeCli : claudeApi;

// Codex — CLI agent (CodexAgent does not have an API mode)
export const codex = new CodexAgent({
  model: "gpt-5.3-codex",
  systemPrompt: SYSTEM_PROMPT,
  yolo: UNSAFE,
  timeoutMs: 30 * 60 * 1000,
});
```

## Validation Loop — ValidationLoop.tsx

```tsx
// scripts/worktree-feature/components/ValidationLoop.tsx
import { Loop, Sequence } from "smithers-orchestrator";
import { Implement } from "./Implement";
import { Validate } from "./Validate";
import { Review } from "./Review";
import { ReviewFix } from "./ReviewFix";
import { useCtx } from "../smithers";

const MAX_REVIEW_ROUNDS = 3;

export function ValidationLoop({ ticket }: { ticket: { id: string } }) {
  const ctx = useCtx();
  const ticketId = ticket.id;

  const claudeReview = ctx.latest("review", `${ticketId}:review-claude`);
  const codexReview = ctx.latest("review", `${ticketId}:review-codex`);

  const allApproved = !!claudeReview?.approved && !!codexReview?.approved;

  return (
    <Loop
      id={`${ticketId}:impl-review-loop`}
      until={allApproved}
      maxIterations={MAX_REVIEW_ROUNDS}
      onMaxReached="return-last"
    >
      <Sequence>
        <Implement ticket={ticket} />
        <Validate ticket={ticket} />
        <Review ticket={ticket} />
        <ReviewFix ticket={ticket} />
      </Sequence>
    </Loop>
  );
}
```

## Parallel Review — Review.tsx

```tsx
// scripts/worktree-feature/components/Review.tsx
import { Parallel } from "smithers-orchestrator";
import { Task, useCtx, outputs } from "../smithers";
import { claude, codex } from "../agents";
import ReviewPrompt from "./Review.mdx";

export function Review({ ticket }: { ticket: { id: string; title: string } }) {
  const ctx = useCtx();
  const ticketId = ticket.id;
  const latestValidate = ctx.latest("validate", `${ticketId}:validate`);

  if (!latestValidate?.allPassed) return null;

  return (
    <Parallel>
      <Task
        id={`${ticketId}:review-claude`}
        output={outputs.review}
        agent={claude}
        timeoutMs={15 * 60 * 1000}
        continueOnFail
      >
        <ReviewPrompt ticketId={ticketId} reviewer="claude" />
      </Task>

      <Task
        id={`${ticketId}:review-codex`}
        output={outputs.review}
        agent={codex}
        timeoutMs={15 * 60 * 1000}
        continueOnFail
      >
        <ReviewPrompt ticketId={ticketId} reviewer="codex" />
      </Task>
    </Parallel>
  );
}
```

## Ticket Pipeline — TicketPipeline.tsx

```tsx
// scripts/worktree-feature/components/TicketPipeline.tsx
import { Sequence } from "smithers-orchestrator";
import { ValidationLoop } from "./ValidationLoop";
import { Report } from "./Report";
import { useCtx } from "../smithers";

export function TicketPipeline({ ticket }: { ticket: { id: string } }) {
  const ctx = useCtx();
  const latestReport = ctx.latest("report", `${ticket.id}:report`);
  const ticketComplete = latestReport != null;

  return (
    <Sequence key={ticket.id} skipIf={ticketComplete}>
      <ValidationLoop ticket={ticket} />
      <Report ticket={ticket} />
    </Sequence>
  );
}
```

## Running

```bash
cd scripts/worktree-feature
bun install
./run.sh
```

## Key Patterns

- **`createSmithers`** registers 6 output schemas; generates typed `tables`, `outputs`, and `Task` components.
- **`ClaudeCodeAgent` / `CodexAgent`** run real CLI tools with full filesystem access.
- **`<Loop>`** iterates implement/validate/review/fix until both reviewers approve or `MAX_REVIEW_ROUNDS` is exhausted.
- **`<Parallel>`** runs the two reviews concurrently; both must approve for the loop to exit.
- **`ctx.latest(schemaKey, nodeId)`** reads the highest-iteration output for a task.
- **MDX prompts** -- `.mdx` files serve as prompt templates with JSX interpolation.
- **`skipIf`** skips already-completed tickets on resume.
- **`continueOnFail`** prevents a single review failure from blocking the pipeline.
- **Dynamic ticket mapping** -- `unfinishedTickets.map()` renders one `TicketPipeline` per ticket.
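
The `ctx.latest(schemaKey, nodeId)` pattern above amounts to a "highest iteration wins" lookup. A minimal sketch of that selection rule in plain TypeScript follows. The `StoredRow` shape and the `pickLatest` helper are hypothetical stand-ins for illustration, not the Smithers implementation.

```ts
// Hypothetical row shape: one persisted output per (nodeId, iteration).
type StoredRow<T> = { nodeId: string; iteration: number; value: T };

// Return the value with the highest iteration for a node, or undefined if none exists.
function pickLatest<T>(rows: StoredRow<T>[], nodeId: string): T | undefined {
  let best: StoredRow<T> | undefined;
  for (const row of rows) {
    if (row.nodeId !== nodeId) continue;
    if (best === undefined || row.iteration > best.iteration) best = row;
  }
  return best?.value;
}

const validateRows: StoredRow<{ allPassed: boolean }>[] = [
  { nodeId: "vcs-jj-rewrite:validate", iteration: 0, value: { allPassed: false } },
  { nodeId: "vcs-jj-rewrite:validate", iteration: 1, value: { allPassed: true } },
  { nodeId: "other-ticket:validate", iteration: 0, value: { allPassed: false } },
];

const latestValidate = pickLatest(validateRows, "vcs-jj-rewrite:validate");
console.log(latestValidate?.allPassed); // true
```

A component such as `Review` can then gate on the newest validation result rather than on the first attempt.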

---

## Ghost: Worktree Feature Schemas

> Example from scripts/worktree-feature/components/*.schema.ts — Zod schema definitions for all pipeline stages: discover, implement, validate, review, review-fix, and report.
> Source: https://smithers.sh/examples/worktree-feature-schemas

# scripts/worktree-feature/ — Zod Output Schemas

> **Note:** **Ghost doc** — Real schema files from `scripts/worktree-feature/components/`.


## Discover Schema

```ts
// components/Discover.schema.ts
import { z } from "zod";

export const Ticket = z.object({
  id: z.string().describe("Unique slug identifier (lowercase kebab-case)"),
  title: z.string().describe("Short imperative title"),
  description: z.string().describe("Detailed description"),
  acceptanceCriteria: z.array(z.string()).describe("List of acceptance criteria"),
  filesToModify: z.array(z.string()).describe("Files to modify"),
  filesToCreate: z.array(z.string()).describe("Files to create"),
  dependencies: z.array(z.string()).nullable().describe("IDs of tickets this depends on"),
});
export type Ticket = z.infer<typeof Ticket>;

export const DiscoverOutput = z.object({
  tickets: z.array(Ticket).describe("All tickets ordered by dependency"),
  reasoning: z.string().describe("Why these tickets in this order"),
});
export type DiscoverOutput = z.infer<typeof DiscoverOutput>;
```

## Implement Schema

```ts
// components/Implement.schema.ts
import { z } from "zod";

export const ImplementOutput = z.object({
  filesCreated: z.array(z.string()).nullable().describe("Files created"),
  filesModified: z.array(z.string()).nullable().describe("Files modified"),
  commitMessages: z.array(z.string()).describe("Git commit messages made"),
  whatWasDone: z.string().describe("Detailed description of what was implemented"),
  testsWritten: z.array(z.string()).describe("Test files written"),
  docsUpdated: z.array(z.string()).describe("Documentation files updated"),
  allTestsPassing: z.boolean().describe("Whether all tests pass after implementation"),
  testOutput: z.string().describe("Output from running tests"),
});
export type ImplementOutput = z.infer<typeof ImplementOutput>;
```

## Validate Schema

```ts
// components/Validate.schema.ts
import { z } from "zod";

export const ValidateOutput = z.object({
  allPassed: z.boolean().describe("Whether tests exited with status 0"),
  failingSummary: z.string().nullable().describe("Summary of what failed (null if all passed)"),
  fullOutput: z.string().describe("Full output from test runner"),
});
export type ValidateOutput = z.infer<typeof ValidateOutput>;
```

## Review Schema

```ts
// components/Review.schema.ts
import { z } from "zod";

export const ReviewOutput = z.object({
  reviewer: z.string().default("unknown").describe("Which agent reviewed (claude, codex)"),
  approved: z.boolean().describe("Whether the reviewer approves (LGTM)"),
  issues: z.array(z.object({
    severity: z.enum(["critical", "major", "minor", "nit"]),
    file: z.string(),
    line: z.number().nullable(),
    description: z.string(),
    suggestion: z.string().nullable(),
  })).describe("Issues found during review"),
  testCoverage: z.enum(["excellent", "good", "insufficient", "missing"]),
  codeQuality: z.enum(["excellent", "good", "needs-work", "poor"]),
  feedback: z.string().describe("Overall review feedback"),
});
export type ReviewOutput = z.infer<typeof ReviewOutput>;
```
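
As a sketch of how a loop exit condition might consume this shape, the helpers below decide whether a dual review passed and which issues a fix step would likely prioritize. The `ReviewRow` type is a hand-written mirror of a subset of `ReviewOutput` (in the real project it would come from `z.infer`), and `allApproved` and `blockingIssues` are hypothetical names, not part of Smithers.

```ts
// Hand-written mirror of a subset of ReviewOutput, to keep the sketch dependency-free.
type Severity = "critical" | "major" | "minor" | "nit";
type ReviewIssue = { severity: Severity; file: string; line: number | null; description: string };
type ReviewRow = { reviewer: string; approved: boolean; issues: ReviewIssue[] };

// A row is undefined when a reviewer task produced no output (for example after a failure).
function allApproved(reviews: (ReviewRow | undefined)[]): boolean {
  if (reviews.length === 0) return false;
  for (const review of reviews) {
    if (review === undefined || !review.approved) return false;
  }
  return true;
}

// Collect critical and major issues across all reviewers.
function blockingIssues(reviews: (ReviewRow | undefined)[]): ReviewIssue[] {
  const found: ReviewIssue[] = [];
  for (const review of reviews) {
    if (review === undefined) continue;
    for (const issue of review.issues) {
      if (issue.severity === "critical" || issue.severity === "major") found.push(issue);
    }
  }
  return found;
}

const claudeReview: ReviewRow = { reviewer: "claude", approved: true, issues: [] };
const codexReview: ReviewRow = {
  reviewer: "codex",
  approved: false,
  issues: [
    { severity: "major", file: "src/vcs.ts", line: 12, description: "Missing error handling" },
    { severity: "nit", file: "src/vcs.ts", line: null, description: "Rename variable" },
  ],
};

console.log(allApproved([claudeReview, codexReview])); // false
console.log(blockingIssues([claudeReview, codexReview]).length); // 1
```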

## ReviewFix Schema

```ts
// components/ReviewFix.schema.ts
import { z } from "zod";

export const ReviewFixOutput = z.object({
  fixesMade: z.array(z.object({
    issue: z.string(),
    fix: z.string(),
    file: z.string(),
  })).describe("Fixes applied"),
  falsePositiveComments: z.array(z.object({
    file: z.string(),
    line: z.number(),
    issue: z.string().describe("The review issue that was a false positive"),
    rationale: z.string().describe("Why this is a false positive"),
  })).nullable().describe("False positives to suppress in future reviews"),
  commitMessages: z.array(z.string()).describe("Commit messages for fixes"),
  allIssuesResolved: z.boolean().describe("Whether all review issues were resolved"),
  summary: z.string().describe("Summary of fixes"),
});
export type ReviewFixOutput = z.infer<typeof ReviewFixOutput>;
```

## Report Schema

```ts
// components/Report.schema.ts
import { z } from "zod";

export const ReportOutput = z.object({
  ticketTitle: z.string().describe("Title of the ticket"),
  status: z.enum(["completed", "partial", "failed"]).describe("Final status"),
  summary: z.string().describe("Concise summary of what was implemented"),
  filesChanged: z.number().describe("Number of files changed"),
  testsAdded: z.number().describe("Number of tests added"),
  reviewRounds: z.number().describe("How many review rounds it took"),
  struggles: z.array(z.string()).nullable().describe("Any struggles or issues encountered"),
  lessonsLearned: z.array(z.string()).nullable().describe("Lessons for future tickets"),
});
export type ReportOutput = z.infer<typeof ReportOutput>;
```

## Registration

All schemas register in one `createSmithers` call:

```ts
// smithers.ts
import { createSmithers } from "smithers-orchestrator";
import { DiscoverOutput } from "./components/Discover.schema";
import { ImplementOutput } from "./components/Implement.schema";
import { ValidateOutput } from "./components/Validate.schema";
import { ReviewOutput } from "./components/Review.schema";
import { ReviewFixOutput } from "./components/ReviewFix.schema";
import { ReportOutput } from "./components/Report.schema";

export const { Workflow, Task, useCtx, smithers, tables, outputs } = createSmithers({
  discover: DiscoverOutput,
  implement: ImplementOutput,
  validate: ValidateOutput,
  review: ReviewOutput,
  reviewFix: ReviewFixOutput,
  report: ReportOutput,
});
```

Use `outputs.discover`, `outputs.review`, etc. as the `output` prop on `<Task>`.

## Key Patterns

- **Dual export** -- each file exports the Zod schema and its inferred `type`, giving runtime validation and compile-time types from one source.
- **`.describe()` annotations** -- Smithers passes these to the LLM as field-level instructions in the structured output schema.
- **`.nullable()`** -- keeps optional fields present in the output JSON, set to `null` when there is no value (e.g., `dependencies`, `failingSummary`).
- **`z.enum()`** -- constrains agent output to valid values (severity levels, status codes, quality ratings).
- **Schema composition** -- `Ticket` defined once, reused in `DiscoverOutput.tickets` and as a prop type throughout the pipeline.
- **Single registration** -- `createSmithers` auto-generates SQLite tables; `outputs` provides typed references for `<Task output={outputs.xxx}>`.

---

## Ghost: Worktree Feature MDX Prompts

> Example from scripts/worktree-feature/components/*.mdx — MDX prompt templates used in the worktree-feature pipeline for discovery, implementation, review, validation, and reporting.
> Source: https://smithers.sh/examples/worktree-feature-prompts

# scripts/worktree-feature/ — MDX Prompt Templates

> **Note:** **Ghost doc** — Real MDX prompt files from `scripts/worktree-feature/components/`.


## Usage

Each `.mdx` file is imported as a JSX component and rendered as children of `<Task>`. Props interpolate via `{props.xxx}`:

```tsx
import ImplementPrompt from "./Implement.mdx";

<Task id="implement" output={outputs.implement} agent={codex}>
  <ImplementPrompt
    ticketId="vcs-jj-rewrite"
    ticketTitle="Rewrite VCS layer"
    ticketDescription="Full rewrite of the VCS abstraction"
    acceptanceCriteria="Tests pass"
  />
</Task>
```

## Discover.mdx

```mdx
TICKET DISCOVERY — Break PRD into Ordered Implementation Tickets

GOAL: Break the PRD into ordered, independently-implementable tickets.

STEPS:
1. Read the PRD thoroughly
2. Explore the codebase to understand current state
3. Break the PRD into tickets ordered by dependency
4. Each ticket should be the smallest independently testable unit

TICKET ID RULES:
- IDs MUST be lowercase kebab-case slugs (e.g. "vcs-jj-rewrite")
- NEVER use numeric IDs like T-001 — they collide across runs
```

## Implement.mdx

```mdx
IMPLEMENTATION — Ticket: {props.ticketId} — {props.ticketTitle}

Implement FULLY end-to-end. Do NOT stop until fully implemented + ALL tests pass.

TICKET DESCRIPTION:
{props.ticketDescription}

ACCEPTANCE CRITERIA:
- {props.acceptanceCriteria}

FILES TO MODIFY: {JSON.stringify(props.filesToModify)}
FILES TO CREATE: {JSON.stringify(props.filesToCreate)}

{props.previousImplementation ? `
PREVIOUS ATTEMPT:
What was done: ${props.previousImplementation.whatWasDone}
Fix issues from previous attempt.` : ""}

{props.reviewFixes ? `
REVIEW FIXES NEEDED:
${props.reviewFixes}` : ""}

IMPLEMENTATION RULES:
1. Implement ticket FULLY — nothing unfinished
2. Follow existing framework patterns exactly
3. ALL commits go directly on main. NEVER create branches.
4. After implementing, run tests
5. If tests fail, fix before moving on.
```

## Review.mdx

```mdx
CODE REVIEW — Ticket: {props.ticketId} — {props.ticketTitle} — Reviewer: {props.reviewer}

EXTREMELY strict code reviewer.

FILES CHANGED:
Created: {JSON.stringify(props.filesCreated)}
Modified: {JSON.stringify(props.filesModified)}

Review against:
1. CORRECTNESS — Matches PRD exactly?
2. CODE QUALITY — DRY? Follows patterns?
3. TEST COVERAGE — Every edge case?
4. TYPE SAFETY — No `any` or unsafe casts?

APPROVAL POLICY:
- ANY way to improve the code → MUST be improved.
- approved: true ONLY when there are genuinely ZERO issues.
```

## Validate.mdx

```mdx
VALIDATION — Ticket: {props.ticketId} — {props.ticketTitle}

Independently verify implementation correctness.
Don't trust implementation agent claims — run everything yourself.

CANONICAL CHECK: Run `bun test`
ALL tests must pass.

ZERO TOLERANCE:
- ALL tests pass. No exceptions.
- Type errors count as failures.
```

## ReviewFix.mdx

```mdx
REVIEW FIX — Ticket: {props.ticketId} — {props.ticketTitle}

REVIEW ISSUES:
{JSON.stringify(props.issues, null, 2)}

REVIEW FEEDBACK:
{props.feedback}

RULES:
1. Fix every legitimate issue
2. If FALSE POSITIVE: record in output JSON
3. Run tests after fixes
```

## Report.mdx

```mdx
REPORTING — Ticket: {props.ticketId} — {props.ticketTitle}

IMPLEMENTATION SUMMARY:
{props.whatWasDone}

PRE-COMPUTED METRICS (echo these back exactly):
- filesChanged: {props.filesChanged}
- reviewRounds: {props.reviewRounds}

Assess: Anything go wrong? Agent struggle? Lessons for future?
```

## System Prompt — system-prompt.mdx

The top-level system prompt uses custom MDX components to inject context:

```mdx
# Smithers Framework — Worktree + MergeQueue Implementation

## PRD
<Prd />

## Smithers Framework Context
<SmithersContext />

## Coding Conventions
- Follow existing patterns exactly
- Run `bun test` to validate changes
- Atomic commits with emoji prefixes
```

Components are injected at render time via `renderMdx()`:

```ts
import { renderMdx } from "smithers-orchestrator";
import SystemPromptMdx from "./prompts/system-prompt.mdx";

export const SYSTEM_PROMPT = renderMdx(SystemPromptMdx, {
  components: { Prd: () => prdContent, SmithersContext: () => contextContent },
});
```

## Key Patterns

- **MDX as prompt templates** -- structured prompts with JSX interpolation for dynamic content.
- **Props-driven** -- ticket data, previous results, and feedback pass as props.
- **Conditional sections** -- `{props.previousImplementation ? ... : ""}` adds context only when iterating.
- **`renderMdx()`** -- composes system prompts from multiple sources using custom MDX components.
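
The conditional-section pattern can be sketched without MDX as an ordinary template function. The version below is an illustration in plain TypeScript, assuming a reduced `ImplementProps` shape and a hypothetical `renderImplementPrompt` helper; the real prompts are the `.mdx` files shown above.

```ts
// Hypothetical props, mirroring a subset of what Implement.mdx reads.
type ImplementProps = {
  ticketId: string;
  ticketTitle: string;
  previousImplementation?: { whatWasDone: string } | null;
  reviewFixes?: string | null;
};

function renderImplementPrompt(props: ImplementProps): string {
  const sections: string[] = [
    `IMPLEMENTATION: ${props.ticketId} (${props.ticketTitle})`,
    // Only present on later iterations, when an earlier attempt exists.
    props.previousImplementation
      ? `PREVIOUS ATTEMPT:\nWhat was done: ${props.previousImplementation.whatWasDone}\nFix issues from previous attempt.`
      : "",
    // Only present when a review round produced fixes to apply.
    props.reviewFixes ? `REVIEW FIXES NEEDED:\n${props.reviewFixes}` : "",
  ];
  // Drop empty sections so first-round prompts carry no blank placeholders.
  return sections.filter((section) => section !== "").join("\n\n");
}

const firstRound = renderImplementPrompt({
  ticketId: "vcs-jj-rewrite",
  ticketTitle: "Rewrite VCS layer",
});

const secondRound = renderImplementPrompt({
  ticketId: "vcs-jj-rewrite",
  ticketTitle: "Rewrite VCS layer",
  previousImplementation: { whatWasDone: "Added jj wrapper" },
  reviewFixes: "Handle empty repositories",
});

console.log(firstRound);
console.log(secondRound);
```

The first call yields only the header line; the second adds both contextual sections.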

---

## Ghost: scripts/worktree-feature/run.sh

> Example from scripts/worktree-feature/run.sh — Shell script launcher for the worktree-feature workflow with environment setup for CLI agents, debug mode, and unsafe permissions.
> Source: https://smithers.sh/examples/worktree-feature-run-sh

# scripts/worktree-feature/run.sh

> **Note:** **Ghost doc** — Real launcher script at `scripts/worktree-feature/run.sh`.


## Source

```bash
#!/usr/bin/env bash
# Run the Worktree+MergeQueue feature workflow
# Usage: ./run.sh

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
ROOT_DIR="$(cd "$SCRIPT_DIR/../../../.." && pwd)"

cd "$SCRIPT_DIR"

export USE_CLI_AGENTS=1
export SMITHERS_DEBUG=1
export SMITHERS_UNSAFE=1
unset ANTHROPIC_API_KEY

SMITHERS_CLI="${SMITHERS_CLI:-./node_modules/.bin/smithers}"

echo "Starting Worktree+MergeQueue feature workflow"
echo "Root directory: $ROOT_DIR"
echo "Press Ctrl+C to stop."
echo ""

bun "$SMITHERS_CLI" run workflow.tsx --input '{}' --root "$ROOT_DIR"
```

## package.json

```json
{
  "name": "worktree-feature-workflow",
  "type": "module",
  "scripts": {
    "start": "bun run workflow.tsx",
    "resume": "smithers up workflow.tsx --run-id <run-id> --resume true",
    "typecheck": "tsc --noEmit"
  },
  "dependencies": {
    "@ai-sdk/anthropic": "^3.0.36",
    "@ai-sdk/openai": "^2.0.0",
    "ai": "^6.0.69",
    "smithers-orchestrator": "file:../../",
    "zod": "^4.3.6"
  }
}
```

## config.ts

```ts
// scripts/worktree-feature/config.ts

/** Maximum review->fix rounds before the validation loop gives up. */
export const MAX_REVIEW_ROUNDS = 3;

/** Steps per review round (implement + validate + review + reviewfix). */
export const STEPS_PER_ROUND = 4;
```

## preload.ts

```ts
// scripts/worktree-feature/preload.ts
import { mdxPlugin } from "smithers-orchestrator/mdx-plugin";

mdxPlugin();
```

## Key Details

- `USE_CLI_AGENTS=1` selects CLI agents (Claude Code / Codex CLI) over API agents. `SMITHERS_UNSAFE=1` enables `dangerouslySkipPermissions` for unattended execution.
- `"smithers-orchestrator": "file:../../"` links to the local package for co-development.
- `preload.ts` registers the MDX plugin, enabling `.mdx` imports as JSX components.
- `--root` passes the repository root so agents access the full codebase, not just the workflow directory.
- `smithers up workflow.tsx --run-id <run-id> --resume true` resumes from the last checkpoint.

---

## Ghost: scripts/generate-llms-txt.ts

> Example from scripts/generate-llms-txt.ts — A utility script that generates llms-full.txt context files from all MDX documentation pages.
> Source: https://smithers.sh/examples/generate-llms-txt

# scripts/generate-llms-txt.ts

> **Note:** **Ghost doc** — Real utility script at `scripts/generate-llms-txt.ts`.


## Source

```ts
#!/usr/bin/env bun
// scripts/generate-llms-txt.ts
import { writeFileSync } from "node:fs";
import { generateLlmsFull } from "./docs-utils";

const output = generateLlmsFull();

writeFileSync("docs/llms-full.txt", output);
console.log(
  `Generated docs/llms-full.txt (${output.length} chars, ~${Math.round(output.length / 4)} tokens)`,
);
```

## Running

```bash
bun scripts/generate-llms-txt.ts
```

## Key Details

- Follows the `llms.txt` convention: one text file containing all documentation for AI model context.
- Reads `docs/docs.json` so output tracks the current navigation tree.
- Strips YAML frontmatter; converts MDX callout components (`<Warning>`, `<Tip>`, `<Note>`) to blockquotes (`> **Warning:** `, `> **Tip:** `, `> **Note:** `).
- Each section includes a source URL back to the live docs page.
- The generator, route preview server, and browser smoke tests share the same manifest helper, so they cannot drift apart when routes change.
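
The real transformation lives in `./docs-utils`, which is not shown here. As a rough sketch of the two text steps described above, the functions below strip a leading YAML frontmatter block and rewrite callout tags such as `<Note>` into blockquotes. All three function names are hypothetical, and the regular expressions are assumptions that only cover the simple, non-nested case.

```ts
// Remove a leading YAML frontmatter block delimited by `---` lines.
function stripFrontmatter(source: string): string {
  return source.replace(/^---\r?\n[\s\S]*?\r?\n---\r?\n?/, "");
}

// Turn <Note>text</Note> style callouts into "> **Note:** text" blockquotes.
function calloutsToBlockquotes(source: string): string {
  return source.replace(
    /<(Warning|Tip|Note)>\s*([\s\S]*?)\s*<\/\1>/g,
    (_match: string, kind: string, body: string) =>
      `> **${kind}:** ${body.replace(/\s*\n\s*/g, " ")}`,
  );
}

// The same chars/4 heuristic the script uses in its log line.
function estimateTokens(text: string): number {
  return Math.round(text.length / 4);
}

const page = "---\ntitle: Demo\n---\n# Demo\n\n<Note>\nGhost doc.\n</Note>\n";
const cleaned = calloutsToBlockquotes(stripFrontmatter(page));
console.log(cleaned);
console.log(estimateTokens(cleaned));
```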

## Related Validation

- `tests/docs-artifacts.test.ts` keeps the committed `docs/llms-full.txt` in sync with the current docs manifest.
- `tests/docs-e2e.playwright.ts` exercises docs routes and legacy redirects against a local preview server.

---

## Ghost: .github/workflows/ci.yml

> Example from .github/workflows/ci.yml — GitHub Actions CI workflow for running typecheck and tests on every push and pull request.
> Source: https://smithers.sh/examples/ci-workflow

# .github/workflows/ci.yml

> **Note:** **Ghost doc** -- Real CI configuration from `.github/workflows/ci.yml`.


## Source

```yaml
# .github/workflows/ci.yml
name: CI

on:
  push:
  pull_request:

jobs:
  core:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: oven-sh/setup-bun@v1
        with:
          bun-version: "1.3.4"
      - name: Install
        run: bun install --frozen-lockfile
      - name: Typecheck
        run: bun run typecheck
      - name: Test
        run: bun test
```

## Notes

- Uses `oven-sh/setup-bun` for fast Bun-based CI.
- `--frozen-lockfile` installs the exact versions recorded in the lockfile and fails if `package.json` no longer matches it.
- Typecheck and tests run as separate steps -- type errors surface even if tests pass.
- Triggers on all pushes and pull requests with no branch filtering.

---

## Ghost: AGENTS.md

> Example from AGENTS.md — Repository instructions for AI coding agents working on the Smithers Bun/TypeScript codebase, its docs, examples, tests, and Burns subproject.
> Source: https://smithers.sh/examples/agents-md

# AGENTS.md

> **Note:** **Ghost doc** — The real `AGENTS.md` from the Smithers repository root.


## Source

````markdown
# Smithers Repository — Agent Instructions

You are working in the Smithers repository, a Bun/TypeScript codebase centered on the `smithers-orchestrator` package.

## Repository Overview

- **Core runtime**: `src/` — Smithers workflow engine, JSX components, CLI, integrations, and observability
- **Docs**: `docs/` — Mintlify source docs
- **Examples**: `examples/` — runnable TSX workflows
- **Tests**: `tests/` — Bun tests, Playwright docs checks, runtime regression coverage
- **Burns**: `burns/` — workspace-first local control plane for Smithers with web, daemon, desktop, and CLI apps

## Common Commands

From the repository root:

```bash
bun test
bun run typecheck
bun run e2e
bun run docs
```

From `burns/`:

```bash
bun run dev:daemon
bun run dev:web
bun run desktop:dev
bun run typecheck
```

## Key Conventions

- The published package name is `smithers-orchestrator`, not `smithers`
- The main public JSX API is `createSmithers(...)`
- Source docs live under `docs/`; `docs/llms-full.txt` is generated from those pages
- `examples/` should stay runnable against the current public API
- The repo may be used from Git or JJ-based workspaces depending on the developer environment; do not assume one workflow unless the local task requires it
````

---

## Ghost: Claude Code Plugin — Smithers Skill

> Example from ~/.claude/plugins/smithers/ — A Claude Code plugin that teaches Claude how to create and monitor Smithers orchestrations.
> Source: https://smithers.sh/examples/claude-plugin-skill

# Claude Code Plugin — Smithers Skill

> **Note:** **Ghost doc** — Real Claude Code plugin at `~/.claude/plugins/smithers/`. Registers the `smithers` skill so Claude Code can create and run workflows.


## plugin.json

```json
{
  "name": "smithers",
  "version": "0.1.0",
  "description": "Build AI agents with declarative JSX for Claude orchestration",
  "author": "William Cory",
  "license": "MIT",
  "repository": "https://github.com/evmts/smithers",
  "keywords": ["orchestration", "multi-agent", "workflow", "ai-agents", "claude", "jsx"],
  "skills": ["skills/smithers"]
}
```

## SKILL.md

```markdown
---
name: smithers-orchestrator
description: Create and monitor multi-agent AI orchestrations using Smithers framework.
allowed-tools: [Read, Write, Edit, Bash, Glob, Grep, Task]
user-invocable: true
recommend-plan-mode: true
---

# Smithers Orchestrator

## When to Use
- Orchestrate multiple AI agents working together
- Create complex multi-phase workflows
- Build agent pipelines with state management

## Quick Start
1. Define schemas with `createSmithers({ output: z.object({...}) })`
2. Create agents with `new Agent({ model, instructions })`
3. Build workflow with `smithers((ctx) => <Workflow>...</Workflow>)`
4. Run with `smithers up workflow.tsx --input '{}'`
```

## EXAMPLES.md

Five complete workflow examples:

1. **Simple Sequential** — Three-phase research/implement/test pipeline
2. **Conditional Branching** — Branches based on analysis results
3. **Parallel Execution** — Frontend/backend/database agents simultaneously
4. **Error Handling and Retry** — Automatic retry with recovery fallback
5. **Data Flow Between Phases** — Requirements/design/implement/test with structured data passing

## REFERENCE.md

| Component | Purpose |
|-----------|---------|
| `<Workflow>` | Root component — defines a named workflow |
| `<Task>` | Executes an agent or static payload, persists structured output |
| `<Sequence>` | Runs children sequentially |
| `<Parallel>` | Runs children concurrently with optional `maxConcurrency` |
| `<Branch>` | Conditional rendering — `if`/`then`/`else` |
| `<Loop>` | Loop controller — iterates `until` condition or `maxIterations` |

---

## Ghost: Claude Code Plugin — Smithers Orchestrator

> Example from ~/.claude/plugins/smithers-orchestrator/ — A Claude Code plugin for multi-agent orchestration with monitoring and phase tracking.
> Source: https://smithers.sh/examples/claude-plugin-orchestrator

# Claude Code Plugin — Smithers Orchestrator

> **Note:** **Ghost doc** — Real Claude Code plugin at `~/.claude/plugins/smithers-orchestrator/`.


## plugin.json

```json
{
  "name": "smithers-orchestrator",
  "version": "1.0.0",
  "description": "Multi-agent orchestration framework using Smithers.",
  "author": "Smithers Framework Contributors",
  "license": "MIT",
  "skills": ["skills/smithers-orchestrator"]
}
```

## Monitor Output Format

```
[10:30:00] ◆ PHASE: Research      Status: STARTING
[10:30:01] ● AGENT: Claude        Status: RUNNING
[10:30:05] ⚡ TOOL CALL: Read     File: src/index.ts
[10:30:12] ✓ PHASE: Research      Status: COMPLETE
```

## Workflow Template

```tsx
import { createSmithers, Sequence, Task, Workflow } from "smithers-orchestrator";
import { ToolLoopAgent as Agent } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const { smithers, outputs } = createSmithers({
  research: z.object({ findings: z.string() }),
  summary: z.object({ text: z.string() }),
});

const researcher = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions: "Research the topic thoroughly.",
});

const writer = new Agent({
  model: anthropic("claude-sonnet-4-5-20250929"),
  instructions: "Write a clear, concise summary.",
});

export default smithers((ctx) => (
  <Workflow name="research-workflow">
    <Sequence>
      <Task id="research" output={outputs.research} agent={researcher}>
        {`Research: ${ctx.input.topic}`}
      </Task>
      <Task id="summarize" output={outputs.summary} agent={writer}>
        {`Summarize: ${ctx.outputMaybe("research", { nodeId: "research" })?.findings}`}
      </Task>
    </Sequence>
  </Workflow>
));
```

## Best Practices

1. Use `createSmithers` for schema-driven workflows with auto-persistence.
2. Use `outputs.xxx` as the `output` prop on `<Task>` (the Zod schema, not a string key).
3. Use `ctx.outputMaybe()` for cross-task data flow (returns `undefined` if not yet available).
4. Set `maxIterations` on `<Loop>` to prevent infinite loops.
5. Include `continueOnFail` on non-critical tasks.

---

## Core Type Reference

> Reference for the core exported Smithers runtime, component, server, and error types.
> Source: https://smithers.sh/reference/types

The types on this page are the core JSX/runtime types most Smithers users work with directly. They are exported from `smithers-orchestrator` unless noted otherwise.

Additional public type families are also exported for adjacent surfaces:

| Family | Exports |
|---|---|
| Workflow builder | `CreateSmithersApi` |
| Serve app | `ServeOptions` |
| Observability | `ResolvedSmithersObservabilityOptions`, `SmithersLogFormat`, `SmithersObservabilityOptions`, `SmithersObservabilityService` |
| Agents | `AnthropicAgentOptions`, `OpenAIAgentOptions`, `PiAgentOptions`, `PiExtensionUiRequest`, `PiExtensionUiResponse` |
| Scorers | `ScoreResult`, `ScorerInput`, `ScorerFn`, `Scorer`, `SamplingConfig`, `ScorerBinding`, `ScorersMap`, `ScoreRow`, `AggregateScore`, `ScorerContext`, `CreateScorerConfig`, `LlmJudgeConfig`, `AggregateOptions` from `smithers-orchestrator/scorers` |
| Renderer / builder internals | `HostContainer`, `SmithersSqliteOptions` |
| VCS helpers | `RunJjOptions`, `RunJjResult`, `JjRevertResult`, `WorkspaceAddOptions`, `WorkspaceResult`, `WorkspaceInfo` |

## Workflow Types

### SmithersWorkflow\<Schema\>

```ts
type SmithersWorkflow<Schema> = {
  db: unknown;
  build: (ctx: SmithersCtx<Schema>) => React.ReactElement;
  opts: SmithersWorkflowOptions;
  schemaRegistry?: Map<string, SchemaRegistryEntry>;
};
```

| Field | Type | Description |
|---|---|---|
| `db` | `unknown` | Drizzle ORM database instance |
| `build` | `(ctx: SmithersCtx<Schema>) => ReactElement` | Renders the workflow JSX tree |
| `opts` | `SmithersWorkflowOptions` | Workflow-level options |
| `schemaRegistry` | `Map<string, SchemaRegistryEntry>` | Output table names to schema entries |

---

### SmithersWorkflowOptions

```ts
type SmithersWorkflowOptions = {
  cache?: boolean;
};
```

| Field | Type | Default | Description |
|---|---|---|---|
| `cache` | `boolean` | `undefined` | Enable task output caching across runs |

---

### SchemaRegistryEntry

```ts
type SchemaRegistryEntry = {
  table: any;
  zodSchema: import("zod").ZodObject<any>;
};
```

| Field | Type | Description |
|---|---|---|
| `table` | `any` | Drizzle ORM table definition |
| `zodSchema` | `ZodObject<any>` | Zod schema for output validation |

---

## Context Types

### SmithersCtx\<Schema\>

```ts
interface SmithersCtx<Schema> {
  runId: string;
  iteration: number;
  iterations?: Record<string, number>;
  input: Schema extends { input: infer T } ? T : any;
  auth: RunAuthContext | null;
  outputs: OutputAccessor<Schema>;

  output<T extends keyof Schema>(
    table: Schema[T],
    key: OutputKey,
  ): InferRow<Schema[T]>;

  outputMaybe<T extends keyof Schema>(
    table: Schema[T],
    key: OutputKey,
  ): InferRow<Schema[T]> | undefined;

  latest(table: any, nodeId: string): any;

  latestArray(value: unknown, schema: import("zod").ZodType): any[];

  iterationCount(table: any, nodeId: string): number;
}
```

| Field / Method | Type | Description |
|---|---|---|
| `runId` | `string` | Current run ID |
| `iteration` | `number` | Current Loop iteration (0 outside loops) |
| `iterations` | `Record<string, number>` | Loop ID to current iteration count |
| `input` | Inferred from Schema | Typed input data |
| `auth` | `RunAuthContext \| null` | Authentication context passed via `RunOptions.auth`. `null` when no auth context is configured. |
| `outputs` | `OutputAccessor<Schema>` | Accessor for all output rows |
| `output(table, key)` | `InferRow<T>` | Get output row. Throws if missing. The `table` parameter accepts a Zod schema from `outputs`, a Drizzle table, or a string schema key. |
| `outputMaybe(table, key)` | `InferRow<T> \| undefined` | Get output row or `undefined`. Same table resolution as `output()`. |
| `latest(table, nodeId)` | `any` | Latest output row (highest iteration). Same table resolution as `output()`. |
| `latestArray(value, schema)` | `any[]` | JSON-parse `value` if it is a string, coerce to array, then validate each element against `schema` (Zod). Invalid items are silently dropped. |
| `iterationCount(table, nodeId)` | `number` | Distinct iteration count for a node |
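
The `latestArray` row describes three steps: parse, coerce to an array, and filter by schema. A dependency-free sketch of that behavior follows. A type guard stands in for the Zod schema, and the handling of unparseable strings and of `null` is an assumption, since the table does not specify it.

```ts
// Stand-in for a Zod schema: a type guard plays the role of a per-item validity check.
type Guard<T> = (item: unknown) => item is T;

function latestArraySketch<T>(value: unknown, isValid: Guard<T>): T[] {
  let parsed: unknown = value;
  if (typeof value === "string") {
    try {
      parsed = JSON.parse(value);
    } catch {
      return []; // Assumption: unparseable strings yield an empty array.
    }
  }
  // Assumption: null/undefined become [], any other non-array value becomes a one-element array.
  const items: unknown[] = Array.isArray(parsed) ? parsed : parsed == null ? [] : [parsed];
  // Invalid items are dropped without raising an error.
  return items.filter(isValid);
}

const isTicket = (item: unknown): item is { id: string } =>
  typeof item === "object" && item !== null && typeof (item as { id?: unknown }).id === "string";

const tickets = latestArraySketch('[{"id":"vcs-jj-rewrite"},{"id":42},"junk"]', isTicket);
console.log(tickets.length); // 1
```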

---

### OutputKey

```ts
type OutputKey = {
  nodeId: string;
  iteration?: number;
};
```

| Field | Type | Default | Description |
|---|---|---|---|
| `nodeId` | `string` | -- | Task node ID |
| `iteration` | `number` | `0` | Loop iteration |
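
To illustrate how the `iteration` default interacts with `output()` and `outputMaybe()`, here is a small in-memory sketch. `OutputStoreSketch` is hypothetical; Smithers persists rows in SQLite, and the key format below is an assumption made only for this example.

```ts
type OutputKey = { nodeId: string; iteration?: number };

// Hypothetical in-memory store keyed by "nodeId#iteration".
class OutputStoreSketch<Row> {
  private rows = new Map<string, Row>();

  private keyOf(key: OutputKey): string {
    return `${key.nodeId}#${key.iteration ?? 0}`; // iteration defaults to 0
  }

  put(key: OutputKey, row: Row): void {
    this.rows.set(this.keyOf(key), row);
  }

  // Mirrors outputMaybe(): undefined when the row does not exist yet.
  outputMaybe(key: OutputKey): Row | undefined {
    return this.rows.get(this.keyOf(key));
  }

  // Mirrors output(): throws when the row is missing.
  output(key: OutputKey): Row {
    const row = this.outputMaybe(key);
    if (row === undefined) throw new Error(`No output for ${this.keyOf(key)}`);
    return row;
  }
}

const store = new OutputStoreSketch<{ findings: string }>();
store.put({ nodeId: "research" }, { findings: "done" });

console.log(store.outputMaybe({ nodeId: "research", iteration: 0 })?.findings); // "done"
console.log(store.outputMaybe({ nodeId: "summarize" })); // undefined
```

Omitting `iteration` and passing `iteration: 0` address the same row, which is why tasks outside a loop can be read with only a `nodeId`.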

---

### OutputAccessor\<Schema\>

Callable object that retrieves output rows. Can be called as a function or accessed as a property.

```ts
type OutputAccessor<Schema> = ((table: any) => any[]) & Record<string, any[]>;
```
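
A value of this shape can be built by attaching properties to a function. The sketch below is an illustration only: `makeAccessor` is a hypothetical helper, and it takes a string table key for simplicity, whereas the real accessor's `table` parameter is typed `any`.

```ts
type RowsByTable = Record<string, unknown[]>;
type AccessorSketch = ((table: string) => unknown[]) & RowsByTable;

function makeAccessor(rowsByTable: RowsByTable): AccessorSketch {
  // Function form: look rows up by table key, defaulting to an empty list.
  const lookup = (table: string): unknown[] => rowsByTable[table] ?? [];
  // Property form: copy each table's rows onto the function object.
  // Caveat: keys that collide with built-in function properties (such as "name") would fail.
  return Object.assign(lookup, rowsByTable) as AccessorSketch;
}

const outputsSketch = makeAccessor({ review: [{ approved: true }], report: [] });

console.log(outputsSketch("review").length); // 1
console.log(outputsSketch.review.length); // 1
console.log(outputsSketch("unknown").length); // 0
```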

---

### InferRow\<TTable\>

Extracts the select row type from a Drizzle table.

```ts
type InferRow<TTable> = TTable extends { $inferSelect: infer R } ? R : never;
```
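
The conditional type can be exercised without Drizzle by giving any object a `$inferSelect` member. The `reviewTable` object below is a hypothetical stand-in, not a real Drizzle table.

```ts
type InferRow<TTable> = TTable extends { $inferSelect: infer R } ? R : never;

// Hypothetical table-like object carrying a row type in `$inferSelect`.
const reviewTable = {
  $inferSelect: {} as { nodeId: string; approved: boolean },
};

// Resolves to { nodeId: string; approved: boolean }.
type ReviewRowSketch = InferRow<typeof reviewTable>;

// Resolves to never, because the input has no `$inferSelect` member.
type NotATable = InferRow<{ label: string }>;

const reviewRow: ReviewRowSketch = { nodeId: "t1:review-claude", approved: true };
console.log(reviewRow.approved); // true
```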

---

## Run Types

### RunOptions

```ts
type RunOptions = {
  runId?: string;
  parentRunId?: string;
  input: Record<string, unknown>;
  maxConcurrency?: number;
  onProgress?: (e: SmithersEvent) => void;
  signal?: AbortSignal;
  resume?: boolean;
  force?: boolean;
  workflowPath?: string;
  rootDir?: string;
  logDir?: string | null;
  allowNetwork?: boolean;
  maxOutputBytes?: number;
  toolTimeoutMs?: number;
  hot?: boolean | HotReloadOptions;
  auth?: RunAuthContext;
  cliAgentToolsDefault?: "all" | "explicit-only";
};
```

| Field | Type | Default | Description |
|---|---|---|---|
| `runId` | `string` | Auto-generated | Custom run identifier. Falls back to `randomUUID()` when omitted. |
| `parentRunId` | `string` | `undefined` | Parent run ID for child workflow / subflow tracking. |
| `input` | `Record<string, unknown>` | -- | Input data (required) |
| `maxConcurrency` | `number` | `4` | Max parallel tasks |
| `onProgress` | `(e: SmithersEvent) => void` | `undefined` | Lifecycle event callback |
| `signal` | `AbortSignal` | `undefined` | Cancellation signal |
| `resume` | `boolean` | `false` | Resume from last checkpoint |
| `force` | `boolean` | `false` | Allow resume even if the run is still active (overrides liveness check) |
| `workflowPath` | `string` | `undefined` | Workflow file path (for tool context) |
| `rootDir` | `string` | `undefined` | Sandbox root for tools |
| `logDir` | `string \| null` | `undefined` | Event log directory (`null` to disable) |
| `allowNetwork` | `boolean` | `false` | Allow `bash` network access |
| `maxOutputBytes` | `number` | `200000` | Max output bytes per tool call |
| `toolTimeoutMs` | `number` | `60000` | Tool execution timeout (ms) |
| `hot` | `boolean \| HotReloadOptions` | `undefined` | Hot-reload mode |
| `auth` | `RunAuthContext` | `undefined` | Authentication context. Accessible as `ctx.auth` inside the workflow. |
| `cliAgentToolsDefault` | `"all" \| "explicit-only"` | `"all"` | Default tool access policy for CLI-backed agents. `"explicit-only"` restricts agents to tools listed in `allowTools`. |
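
The documented defaults can be read as a resolution step applied to the caller's options. The sketch below covers only the fields with concrete defaults in the table; `RunOptionsSubset` and `resolveRunOptions` are hypothetical names used for illustration, not Smithers exports.

```ts
// Subset of RunOptions covering the fields with documented concrete defaults.
type RunOptionsSubset = {
  input: Record<string, unknown>;
  maxConcurrency?: number;
  resume?: boolean;
  force?: boolean;
  allowNetwork?: boolean;
  maxOutputBytes?: number;
  toolTimeoutMs?: number;
  cliAgentToolsDefault?: "all" | "explicit-only";
};

// Fill each omitted field with the default listed in the table above.
function resolveRunOptions(opts: RunOptionsSubset): Required<RunOptionsSubset> {
  return {
    input: opts.input,
    maxConcurrency: opts.maxConcurrency ?? 4,
    resume: opts.resume ?? false,
    force: opts.force ?? false,
    allowNetwork: opts.allowNetwork ?? false,
    maxOutputBytes: opts.maxOutputBytes ?? 200000,
    toolTimeoutMs: opts.toolTimeoutMs ?? 60000,
    cliAgentToolsDefault: opts.cliAgentToolsDefault ?? "all",
  };
}

const resolved = resolveRunOptions({ input: { topic: "durable timers" }, resume: true, maxConcurrency: 2 });
console.log(resolved.maxConcurrency); // 2
console.log(resolved.toolTimeoutMs); // 60000
```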

---

### HotReloadOptions

```ts
type HotReloadOptions = {
  rootDir?: string;
  outDir?: string;
  maxGenerations?: number;
  cancelUnmounted?: boolean;
  debounceMs?: number;
};
```

| Field | Type | Default | Description |
|---|---|---|---|
| `rootDir` | `string` | Auto-detect | Directory to watch |
| `outDir` | `string` | `.smithers/hmr/<runId>` | Generation overlay directory |
| `maxGenerations` | `number` | `3` | Max overlay generations |
| `cancelUnmounted` | `boolean` | `false` | Cancel unmounted tasks after reload |
| `debounceMs` | `number` | `100` | File change debounce (ms) |
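
Because `RunOptions.hot` accepts either a boolean or an options object, a caller-side normalization might look like the sketch below. `resolveHot` is hypothetical, and treating `hot: true` as "all defaults" is an assumption; `rootDir` is left unset because the table only says it is auto-detected.

```ts
type HotReloadOptions = {
  rootDir?: string;
  outDir?: string;
  maxGenerations?: number;
  cancelUnmounted?: boolean;
  debounceMs?: number;
};

type ResolvedHot = {
  rootDir: string | undefined;
  outDir: string;
  maxGenerations: number;
  cancelUnmounted: boolean;
  debounceMs: number;
};

// Returns null when hot reload is off; otherwise fills in the documented defaults.
function resolveHot(hot: boolean | HotReloadOptions | undefined, runId: string): ResolvedHot | null {
  if (hot === undefined || hot === false) return null;
  const opts: HotReloadOptions = hot === true ? {} : hot;
  return {
    rootDir: opts.rootDir, // auto-detection is not modeled in this sketch
    outDir: opts.outDir ?? `.smithers/hmr/${runId}`,
    maxGenerations: opts.maxGenerations ?? 3,
    cancelUnmounted: opts.cancelUnmounted ?? false,
    debounceMs: opts.debounceMs ?? 100,
  };
}

console.log(resolveHot(true, "run-1")?.outDir); // ".smithers/hmr/run-1"
console.log(resolveHot({ debounceMs: 250 }, "run-1")?.debounceMs); // 250
console.log(resolveHot(false, "run-1")); // null
```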

---

### RunResult

```ts
type RunResult = {
  runId: string;
  status: "finished" | "failed" | "cancelled" | "continued" | "waiting-approval" | "waiting-event" | "waiting-timer";
  output?: unknown;
  error?: unknown;
};
```

| Field | Type | Description |
|---|---|---|
| `runId` | `string` | Run identifier |
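| `status` | `string` | Status when the run returned: terminal (`"finished"`, `"failed"`, `"cancelled"`, `"continued"`) or waiting (`"waiting-approval"`, `"waiting-event"`, `"waiting-timer"`) |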
| `output` | `unknown` | Final output (if finished) |
| `error` | `unknown` | Error details (if failed) |

---

### RunStatus

```ts
type RunStatus =
  | "running"
  | "waiting-approval"
  | "waiting-event"
  | "waiting-timer"
  | "finished"
  | "continued"
  | "failed"
  | "cancelled";
```

| Value | Description |
|---|---|
| `"running"` | Actively executing |
| `"waiting-approval"` | Awaiting human approval |
| `"waiting-event"` | Awaiting an external signal or event |
| `"waiting-timer"` | Suspended until a durable timer fires |
| `"finished"` | All tasks completed |
| `"continued"` | Run ended via `<ContinueAsNew>` and a fresh run has started |
| `"failed"` | Unrecoverable task failure |
| `"cancelled"` | Cancelled via `AbortSignal` or API |
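
When consuming these values it can help to separate suspended states from ended ones. The following is a minimal sketch, not part of the Smithers API: `isWaiting` and `hasEnded` are hypothetical helpers, and counting `"continued"` as ended is an assumption based on the table's description that the run ended via `<ContinueAsNew>`.

```ts
// Local copy of the documented union so the sketch is self-contained.
type RunStatus =
  | "running"
  | "waiting-approval"
  | "waiting-event"
  | "waiting-timer"
  | "finished"
  | "continued"
  | "failed"
  | "cancelled";

// Hypothetical helper: true for states where the run is suspended.
function isWaiting(status: RunStatus): boolean {
  return (
    status === "waiting-approval" ||
    status === "waiting-event" ||
    status === "waiting-timer"
  );
}

// Hypothetical helper: true when this run will not execute further.
// Assumption: "continued" counts as ended for the original run.
function hasEnded(status: RunStatus): boolean {
  return (
    status === "finished" ||
    status === "continued" ||
    status === "failed" ||
    status === "cancelled"
  );
}
```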

---

## Task Types

### TaskDescriptor

Internal representation of a task extracted from the JSX tree. Scheduled and executed by the engine.

```ts
type TaskDescriptor = {
  nodeId: string;
  ordinal: number;
  iteration: number;
  ralphId?: string;
  dependsOn?: string[];
  needs?: Record<string, string>;
  worktreeId?: string;
  worktreePath?: string;
  worktreeBranch?: string;

  outputTable: Table | null;
  outputTableName: string;
  outputRef?: import("zod").ZodObject<any>;
  outputSchema?: import("zod").ZodObject<any>;

  parallelGroupId?: string;
  parallelMaxConcurrency?: number;

  needsApproval: boolean;
  approvalMode?: "gate" | "decision";
  approvalOnDeny?: "fail" | "continue" | "skip";
  skipIf: boolean;
  retries: number;
  retryPolicy?: RetryPolicy;
  timeoutMs: number | null;
  continueOnFail: boolean;
  cachePolicy?: CachePolicy;

  agent?: AgentLike | AgentLike[];
  prompt?: string;
  staticPayload?: unknown;
  computeFn?: () => unknown | Promise<unknown>;

  label?: string;
  meta?: Record<string, unknown>;
};
```

| Field | Type | Description |
|---|---|---|
| `nodeId` | `string` | Unique task identifier (from `id` prop) |
| `ordinal` | `number` | Execution order position (depth-first) |
| `iteration` | `number` | Current Loop iteration |
| `ralphId` | `string` | Parent Loop ID |
| `dependsOn` | `string[]` | Explicit dependency node IDs |
| `needs` | `Record<string, string>` | Named dependencies (keys = context keys, values = node IDs) |
| `worktreeId` | `string` | Assigned git worktree ID |
| `worktreePath` | `string` | Worktree filesystem path |
| `worktreeBranch` | `string` | Worktree branch name |
| `outputTable` | `Table \| null` | Drizzle table for output persistence |
| `outputTableName` | `string` | Output table name |
| `outputRef` | `ZodObject<any>` | Zod schema reference from `output` prop |
| `outputSchema` | `ZodObject<any>` | Zod schema for validating agent output |
| `parallelGroupId` | `string` | Parent `<Parallel>` group ID |
| `parallelMaxConcurrency` | `number` | Concurrency limit from parent `<Parallel>` |
| `needsApproval` | `boolean` | Requires human approval |
| `approvalMode` | `"gate" \| "decision"` | `"gate"` pauses before execution; `"decision"` records a decision |
| `approvalOnDeny` | `"fail" \| "continue" \| "skip"` | Behavior on denial |
| `skipIf` | `boolean` | Skip this task |
| `retries` | `number` | Retry attempts on failure |
| `retryPolicy` | `RetryPolicy` | Backoff configuration |
| `timeoutMs` | `number \| null` | Task timeout (ms) |
| `continueOnFail` | `boolean` | Continue workflow on failure |
| `cachePolicy` | `CachePolicy` | Cache configuration |
| `agent` | `AgentLike \| AgentLike[]` | Agent or fallback chain |
| `prompt` | `string` | Prompt text (from children) |
| `staticPayload` | `unknown` | Pre-computed payload (non-agent tasks) |
| `computeFn` | `() => unknown \| Promise<unknown>` | Compute callback |
| `label` | `string` | Display label |
| `meta` | `Record<string, unknown>` | Arbitrary metadata |
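
Because `dependsOn` and `needs` both refer to other tasks by node ID, a task list can be checked for dangling references. This is a sketch under assumptions: `findUnknownDependencies` is a hypothetical helper, `TaskRef` is a trimmed-down stand-in for `TaskDescriptor`, and nothing here claims the engine performs this check itself.

```ts
// Minimal stand-in for the dependency-related fields of TaskDescriptor.
type TaskRef = {
  nodeId: string;
  dependsOn?: string[];
  needs?: Record<string, string>;
};

// Hypothetical helper: maps each task to the node IDs it references
// that do not exist in the list. Tasks with no problems are omitted.
function findUnknownDependencies(tasks: TaskRef[]): Map<string, string[]> {
  const known = new Set(tasks.map((task) => task.nodeId));
  const problems = new Map<string, string[]>();
  for (const task of tasks) {
    // `needs` values are node IDs; its keys are context keys.
    const referenced = [
      ...(task.dependsOn ?? []),
      ...Object.values(task.needs ?? {}),
    ];
    const unknown = referenced.filter((id) => !known.has(id));
    if (unknown.length > 0) problems.set(task.nodeId, unknown);
  }
  return problems;
}
```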

---

### AgentLike

```ts
type AgentLike = Agent<any, any, any>;
```

---

### RetryPolicy

```ts
type RetryBackoff = "fixed" | "linear" | "exponential";

type RetryPolicy = {
  backoff?: RetryBackoff;
  initialDelayMs?: number;
};
```

| Field | Type | Default | Description |
|---|---|---|---|
| `backoff` | `"fixed" \| "linear" \| "exponential"` | `"fixed"` | Backoff strategy. `"fixed"`: constant delay. `"linear"`: delay increases by `initialDelayMs` each attempt. `"exponential"`: delay doubles each attempt. |
| `initialDelayMs` | `number` | `0` | Base delay in milliseconds. When `0`, retries run immediately with no delay. |

Backoff delay formulas (attempt 1-indexed):

| Strategy | Delay formula |
|---|---|
| `"fixed"` | `initialDelayMs` every attempt |
| `"linear"` | `initialDelayMs * attempt` |
| `"exponential"` | `initialDelayMs * 2^(attempt - 1)` |

When `retries` is set on a `<Task>` without a `retryPolicy`, the task retries immediately (no delay).
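
The formulas above can be written as a single function. This is a minimal sketch of the documented arithmetic: `retryDelayMs` is a hypothetical name, and the engine's actual implementation may differ (for example, by adding jitter or caps that are not described here).

```ts
// Local copies of the documented types so the sketch is self-contained.
type RetryBackoff = "fixed" | "linear" | "exponential";

type RetryPolicy = {
  backoff?: RetryBackoff;
  initialDelayMs?: number;
};

// Hypothetical helper: delay before a given retry attempt (1-indexed),
// using the documented defaults ("fixed" backoff, 0 ms initial delay).
function retryDelayMs(policy: RetryPolicy | undefined, attempt: number): number {
  const initial = policy?.initialDelayMs ?? 0;
  const backoff = policy?.backoff ?? "fixed";
  if (backoff === "linear") return initial * attempt;
  if (backoff === "exponential") return initial * 2 ** (attempt - 1);
  return initial; // "fixed": same delay every attempt
}
```

With `initialDelayMs: 100`, attempts 1 through 3 wait 100, 200, 300 ms under `"linear"` and 100, 200, 400 ms under `"exponential"`.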

---

### RetryTaskOptions

Options for the programmatic `retryTask()` function, which resets a task and its dependents for re-execution.

```ts
type RetryTaskOptions = {
  runId: string;
  nodeId: string;
  iteration?: number;
  resetDependents?: boolean;
  force?: boolean;
  onProgress?: (event: SmithersEvent) => void;
};
```

| Field | Type | Default | Description |
|---|---|---|---|
| `runId` | `string` | -- | Run ID containing the task to retry |
| `nodeId` | `string` | -- | Node ID of the task to retry |
| `iteration` | `number` | `0` | Loop iteration to retry |
| `resetDependents` | `boolean` | `true` | Also reset downstream tasks that depend on this node |
| `force` | `boolean` | `false` | Allow retry even if the run is still active |
| `onProgress` | `(event: SmithersEvent) => void` | `undefined` | Event callback for `RetryTaskStarted` and `RetryTaskFinished` events |

---

### RetryTaskResult

```ts
type RetryTaskResult = {
  success: boolean;
  resetNodes: string[];
  error?: string;
};
```

| Field | Type | Description |
|---|---|---|
| `success` | `boolean` | Whether the retry operation succeeded |
| `resetNodes` | `string[]` | Node IDs that were reset to `"pending"` |
| `error` | `string` | Error message if the retry operation failed |

---

### CachePolicy

```ts
type CachePolicy<Ctx = any> = {
  by?: (ctx: Ctx) => unknown;
  version?: string;
};
```

| Field | Type | Default | Description |
|---|---|---|---|
| `by` | `(ctx: Ctx) => unknown` | `undefined` | Cache key function. Same key reuses cached output. |
| `version` | `string` | `undefined` | Cache version. Changing it invalidates cached outputs. |
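
As an illustration, a policy might key the cache on only the inputs that affect the result. The sketch below assumes a hypothetical context shape (`ReviewCtx`); the real `ctx` passed to `by` depends on your workflow, and how Smithers turns the returned value into a stored key is not shown here.

```ts
// Local copy of the documented type so the sketch is self-contained.
type CachePolicy<Ctx = any> = {
  by?: (ctx: Ctx) => unknown;
  version?: string;
};

// Hypothetical context shape, for illustration only.
type ReviewCtx = {
  input: { repo: string; commit: string; requestedAt: number };
};

const reviewCache: CachePolicy<ReviewCtx> = {
  // Key on repo and commit; omit requestedAt so repeated requests
  // for the same commit produce the same key.
  by: (ctx) => ({ repo: ctx.input.repo, commit: ctx.input.commit }),
  // Bump this string when the prompt or output schema changes.
  version: "v2",
};
```

Leaving volatile fields such as timestamps out of the key is the design choice that makes reuse possible; including them would make every key unique.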

---

## Graph Types

### GraphSnapshot

```ts
type GraphSnapshot = {
  runId: string;
  frameNo: number;
  xml: XmlNode | null;
  tasks: TaskDescriptor[];
};
```

| Field | Type | Description |
|---|---|---|
| `runId` | `string` | Run identifier |
| `frameNo` | `number` | Render frame number (monotonically increasing) |
| `xml` | `XmlNode \| null` | Rendered XML tree |
| `tasks` | `TaskDescriptor[]` | Ordered task list |
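
Since each snapshot carries the ordered task list for one frame, two snapshots can be compared to see which tasks appeared or disappeared between renders. This is a hedged sketch: `diffSnapshots` is a hypothetical helper, and `SnapshotLike` keeps only the fields the comparison needs.

```ts
// Trimmed-down stand-in for GraphSnapshot.
type SnapshotLike = {
  frameNo: number;
  tasks: { nodeId: string }[];
};

// Hypothetical helper: node IDs present in only one of the two frames.
function diffSnapshots(
  prev: SnapshotLike,
  next: SnapshotLike,
): { mounted: string[]; unmounted: string[] } {
  const before = new Set(prev.tasks.map((task) => task.nodeId));
  const after = new Set(next.tasks.map((task) => task.nodeId));
  return {
    mounted: next.tasks.map((t) => t.nodeId).filter((id) => !before.has(id)),
    unmounted: prev.tasks.map((t) => t.nodeId).filter((id) => !after.has(id)),
  };
}
```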

---

### XmlNode

```ts
type XmlNode = XmlElement | XmlText;
```

---

### XmlElement

```ts
type XmlElement = {
  kind: "element";
  tag: string;
  props: Record<string, string>;
  children: XmlNode[];
};
```

| Field | Type | Description |
|---|---|---|
| `kind` | `"element"` | Discriminant |
| `tag` | `string` | Tag name (`"Workflow"`, `"Task"`, `"Parallel"`, etc.) |
| `props` | `Record<string, string>` | Attributes |
| `children` | `XmlNode[]` | Child nodes |

---

### XmlText

```ts
type XmlText = {
  kind: "text";
  text: string;
};
```

| Field | Type | Description |
|---|---|---|
| `kind` | `"text"` | Discriminant |
| `text` | `string` | Content |
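
The `kind` discriminant makes the tree straightforward to walk. As a minimal sketch (the helper name `countTags` is hypothetical), the function below counts how many elements of each tag appear in a rendered tree, accepting `null` because `GraphSnapshot.xml` may be `null`.

```ts
// Local copies of the documented types so the sketch is self-contained.
type XmlText = { kind: "text"; text: string };
type XmlElement = {
  kind: "element";
  tag: string;
  props: Record<string, string>;
  children: XmlNode[];
};
type XmlNode = XmlElement | XmlText;

// Hypothetical helper: recursively counts elements by tag name.
function countTags(
  node: XmlNode | null,
  counts: Record<string, number> = {},
): Record<string, number> {
  // Text nodes and a missing tree contribute nothing.
  if (node === null || node.kind === "text") return counts;
  counts[node.tag] = (counts[node.tag] ?? 0) + 1;
  for (const child of node.children) countTags(child, counts);
  return counts;
}
```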

---

## Event Types

### SmithersEvent

Discriminated union of all lifecycle events. Every event includes `runId` and `timestampMs`.

```ts
type SmithersEvent =
  | RunStarted
  | RunStatusChanged
  | RunFinished
  | RunFailed
  | RunCancelled
  | RunContinuedAsNew
  | RunHijackRequested
  | RunHijacked
  | FrameCommitted
  | NodePending
  | NodeStarted
  | NodeFinished
  | NodeFailed
  | NodeCancelled
  | NodeSkipped
  | NodeRetrying
  | NodeWaitingApproval
  | ApprovalRequested
  | ApprovalGranted
  | ApprovalDenied
  | ToolCallStarted
  | ToolCallFinished
  | NodeOutput
  | AgentEvent
  | RevertStarted
  | RevertFinished
  | WorkflowReloadDetected
  | WorkflowReloaded
  | WorkflowReloadFailed
  | WorkflowReloadUnsafe
  | ScorerStarted
  | ScorerFinished
  | ScorerFailed
  | TokenUsageReported;
```

#### Run Events

| Event | Fields | Description |
|---|---|---|
| `RunStarted` | `runId`, `timestampMs` | Execution began |
| `RunStatusChanged` | `runId`, `status: RunStatus`, `timestampMs` | Status transition |
| `RunFinished` | `runId`, `timestampMs` | All tasks completed |
| `RunFailed` | `runId`, `error: unknown`, `timestampMs` | Failed |
| `RunCancelled` | `runId`, `timestampMs` | Cancelled |
| `RunContinuedAsNew` | `runId`, `newRunId`, `iteration`, `carriedStateSize`, `ancestryDepth?`, `timestampMs` | Continued as new run |
| `RunHijackRequested` | `runId`, `target?`, `timestampMs` | Hijack requested |
| `RunHijacked` | `runId`, `nodeId`, `iteration`, `attempt`, `engine`, `mode`, `resume?`, `cwd`, `timestampMs` | Hijack completed |

#### Frame Events

| Event | Fields | Description |
|---|---|---|
| `FrameCommitted` | `runId`, `frameNo`, `xmlHash`, `timestampMs` | Frame persisted |

#### Node Events

| Event | Fields | Description |
|---|---|---|
| `NodePending` | `runId`, `nodeId`, `iteration`, `timestampMs` | Queued |
| `NodeStarted` | `runId`, `nodeId`, `iteration`, `attempt`, `timestampMs` | Execution began |
| `NodeFinished` | `runId`, `nodeId`, `iteration`, `attempt`, `timestampMs` | Completed |
| `NodeFailed` | `runId`, `nodeId`, `iteration`, `attempt`, `error`, `timestampMs` | Failed |
| `NodeCancelled` | `runId`, `nodeId`, `iteration`, `attempt?`, `reason?`, `timestampMs` | Cancelled |
| `NodeSkipped` | `runId`, `nodeId`, `iteration`, `timestampMs` | Skipped |
| `NodeRetrying` | `runId`, `nodeId`, `iteration`, `attempt`, `timestampMs` | Retrying |

#### Approval Events

| Event | Fields | Description |
|---|---|---|
| `NodeWaitingApproval` | `runId`, `nodeId`, `iteration`, `timestampMs` | Awaiting approval |
| `ApprovalRequested` | `runId`, `nodeId`, `iteration`, `timestampMs` | Approval requested |
| `ApprovalGranted` | `runId`, `nodeId`, `iteration`, `timestampMs` | Approved |
| `ApprovalDenied` | `runId`, `nodeId`, `iteration`, `timestampMs` | Denied |

#### Tool Events

| Event | Fields | Description |
|---|---|---|
| `ToolCallStarted` | `runId`, `nodeId`, `iteration`, `attempt`, `toolName`, `seq`, `timestampMs` | Tool call began |
| `ToolCallFinished` | `runId`, `nodeId`, `iteration`, `attempt`, `toolName`, `seq`, `status`, `timestampMs` | Tool call completed |

#### Output Events

| Event | Fields | Description |
|---|---|---|
| `NodeOutput` | `runId`, `nodeId`, `iteration`, `attempt`, `text`, `stream`, `timestampMs` | Agent output text |

#### Revert Events

| Event | Fields | Description |
|---|---|---|
| `RevertStarted` | `runId`, `nodeId`, `iteration`, `attempt`, `jjPointer`, `timestampMs` | VCS revert started |
| `RevertFinished` | `runId`, `nodeId`, `iteration`, `attempt`, `jjPointer`, `success`, `error?`, `timestampMs` | VCS revert completed |
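
One way to use the node events is to measure how long each attempt took. The sketch below is illustrative and rests on a loud assumption: the tables above do not name the discriminant field, so a `type` property holding the event name is assumed here. `attemptDurations` is a hypothetical helper that covers only three of the node events.

```ts
// ASSUMPTION: events are discriminated by a `type` field holding the event name.
type NodeBase = {
  runId: string;
  nodeId: string;
  iteration: number;
  attempt: number;
  timestampMs: number;
};
type NodeLifecycleEvent =
  | ({ type: "NodeStarted" } & NodeBase)
  | ({ type: "NodeFinished" } & NodeBase)
  | ({ type: "NodeFailed"; error: unknown } & NodeBase);

// Hypothetical helper: elapsed ms per attempt, keyed by nodeId#iteration#attempt.
function attemptDurations(events: NodeLifecycleEvent[]): Map<string, number> {
  const startedAt = new Map<string, number>();
  const durations = new Map<string, number>();
  for (const event of events) {
    const key = `${event.nodeId}#${event.iteration}#${event.attempt}`;
    if (event.type === "NodeStarted") {
      startedAt.set(key, event.timestampMs);
      continue;
    }
    // NodeFinished or NodeFailed closes the attempt if a start was seen.
    const start = startedAt.get(key);
    if (start !== undefined) durations.set(key, event.timestampMs - start);
  }
  return durations;
}
```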

---

## Component Props

### WorkflowProps

```ts
type WorkflowProps = {
  name: string;
  cache?: boolean;
  children?: React.ReactNode;
};
```

| Prop | Type | Required | Description |
|---|---|---|---|
| `name` | `string` | Yes | Workflow name for logging and database |
| `cache` | `boolean` | No | Enable output caching |
| `children` | `ReactNode` | No | Child components |

---

### TaskProps\<Row, Output, D\>

```ts
type TaskProps<Row, Output, D extends DepsSpec = {}> = {
  key?: string;
  id: string;
  output: Output;
  outputSchema?: import("zod").ZodObject<any>;
  agent?: AgentLike | AgentLike[];
  fallbackAgent?: AgentLike;
  dependsOn?: string[];
  needs?: Record<string, string>;
  deps?: D;
  skipIf?: boolean;
  needsApproval?: boolean;
  timeoutMs?: number;
  retries?: number;
  retryPolicy?: RetryPolicy;
  continueOnFail?: boolean;
  cache?: CachePolicy;
  scorers?: ScorersMap;
  label?: string;
  meta?: Record<string, unknown>;
  children: string | Row | (() => Row | Promise<Row>) | React.ReactNode | ((deps: InferDeps<D>) => Row | React.ReactNode);
};
```

| Prop | Type | Required | Description |
|---|---|---|---|
| `id` | `string` | Yes | Unique task identifier |
| `output` | `ZodObject<any> \| Table \| string` | Yes | Output target: Zod schema from `outputs` (recommended), Drizzle table, or string key |
| `outputSchema` | `ZodObject<any>` | No | Zod schema for agent output validation |
| `agent` | `AgentLike \| AgentLike[]` | No | Agent or fallback chain |
| `fallbackAgent` | `AgentLike` | No | Append one fallback agent |
| `dependsOn` | `string[]` | No | Explicit dependency node IDs |
| `needs` | `Record<string, string>` | No | Named dependencies (keys = context keys, values = node IDs) |
| `deps` | `Record<string, OutputTarget>` | No | Typed render-time dependencies. Keys resolve from task ids or `needs` entries. |
| `skipIf` | `boolean` | No | Skip when true |
| `needsApproval` | `boolean` | No | Pause for human approval |
| `timeoutMs` | `number` | No | Timeout (ms) |
| `retries` | `number` | No | Retry attempts (default: 0) |
| `retryPolicy` | `RetryPolicy` | No | Retry timing |
| `continueOnFail` | `boolean` | No | Continue workflow on failure |
| `cache` | `CachePolicy` | No | Cache configuration |
| `scorers` | `ScorersMap` | No | Scorers to evaluate task output after completion |
| `label` | `string` | No | Display label |
| `meta` | `Record<string, unknown>` | No | Arbitrary metadata |
| `children` | `string \| Row \| (() => Row \| Promise<Row>) \| ReactNode \| ((deps) => Row \| ReactNode)` | Yes | Prompt, compute callback, static payload, deps function, or nested elements |

---

### SequenceProps

```ts
type SequenceProps = {
  skipIf?: boolean;
  children?: React.ReactNode;
};
```

| Prop | Type | Required | Description |
|---|---|---|---|
| `skipIf` | `boolean` | No | Skip entire sequence when true |
| `children` | `ReactNode` | No | Child tasks |

---

### ParallelProps

```ts
type ParallelProps = {
  id?: string;
  maxConcurrency?: number;
  skipIf?: boolean;
  children?: React.ReactNode;
};
```

| Prop | Type | Required | Description |
|---|---|---|---|
| `id` | `string` | No | Group identifier |
| `maxConcurrency` | `number` | No | Max concurrent tasks |
| `skipIf` | `boolean` | No | Skip entire group when true |
| `children` | `ReactNode` | No | Child tasks |

---

### BranchProps

```ts
type BranchProps = {
  if: boolean;
  then: React.ReactElement;
  else?: React.ReactElement;
  skipIf?: boolean;
};
```

| Prop | Type | Required | Description |
|---|---|---|---|
| `if` | `boolean` | Yes | Condition |
| `then` | `ReactElement` | Yes | Rendered when true |
| `else` | `ReactElement` | No | Rendered when false |
| `skipIf` | `boolean` | No | Skip entirely when true |

---

### LoopProps

```ts
type LoopProps = {
  id?: string;
  until: boolean;
  maxIterations?: number;
  onMaxReached?: "fail" | "return-last";
  skipIf?: boolean;
  children?: React.ReactNode;
};
```

| Prop | Type | Required | Description |
|---|---|---|---|
| `id` | `string` | No | Loop identifier (auto-generated if omitted) |
| `until` | `boolean` | Yes | Exit when true |
| `maxIterations` | `number` | No | Max iterations |
| `onMaxReached` | `"fail" \| "return-last"` | No | Behavior on limit: fail or return last output |
| `skipIf` | `boolean` | No | Skip loop when true |
| `children` | `ReactNode` | No | Tasks per iteration |

---

### RalphProps

> **Deprecated:** Use `LoopProps`.

```ts
type RalphProps = LoopProps;
```

---

### ApprovalProps\<Row\>

```ts
type ApprovalProps<Row = ApprovalDecision> = {
  id: string;
  output: ZodObject<any> | Table | string;
  outputSchema?: import("zod").ZodObject<any>;
  request: ApprovalRequest;
  onDeny?: "fail" | "continue" | "skip";
  dependsOn?: string[];
  needs?: Record<string, string>;
  skipIf?: boolean;
  timeoutMs?: number;
  retries?: number;
  retryPolicy?: RetryPolicy;
  continueOnFail?: boolean;
  cache?: CachePolicy;
  label?: string;
  meta?: Record<string, unknown>;
  key?: string;
  children?: React.ReactNode;
};

type ApprovalRequest = {
  title: string;
  summary?: string;
  metadata?: Record<string, unknown>;
};

type ApprovalDecision = {
  approved: boolean;
  note: string | null;
  decidedBy: string | null;
  decidedAt: string | null;
};
```

| Prop | Type | Required | Description |
|---|---|---|---|
| `id` | `string` | Yes | Approval node identifier |
| `output` | `ZodObject<any> \| Table \| string` | Yes | Persistence target for decision |
| `outputSchema` | `ZodObject<any>` | No | Decision output validation |
| `request` | `ApprovalRequest` | Yes | Title, summary, metadata |
| `onDeny` | `"fail" \| "continue" \| "skip"` | No | Behavior on denial |
| `dependsOn` | `string[]` | No | Dependency node IDs |
| `needs` | `Record<string, string>` | No | Named dependencies |
| `skipIf` | `boolean` | No | Skip when true |
| `timeoutMs` | `number` | No | Timeout (ms) |
| `retries` | `number` | No | Retry attempts |
| `retryPolicy` | `RetryPolicy` | No | Retry backoff |
| `continueOnFail` | `boolean` | No | Continue on failure |
| `cache` | `CachePolicy` | No | Cache configuration |
| `label` | `string` | No | Display label (defaults to `request.title`) |
| `meta` | `Record<string, unknown>` | No | Arbitrary metadata |
| `children` | `ReactNode` | No | Child elements |
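
Every field of `ApprovalDecision` except `approved` is nullable, so code that displays a decision needs fallbacks. A minimal sketch, assuming a hypothetical `describeDecision` helper and treating `decidedAt` as an opaque string because its format is not specified here:

```ts
// Local copy of the documented type so the sketch is self-contained.
type ApprovalDecision = {
  approved: boolean;
  note: string | null;
  decidedBy: string | null;
  decidedAt: string | null;
};

// Hypothetical helper: one-line summary that tolerates missing fields.
function describeDecision(decision: ApprovalDecision): string {
  const verdict = decision.approved ? "approved" : "denied";
  const who = decision.decidedBy ?? "unknown reviewer";
  const when = decision.decidedAt ? ` at ${decision.decidedAt}` : "";
  const note = decision.note ? ` (${decision.note})` : "";
  return `${verdict} by ${who}${when}${note}`;
}
```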

---

## Error Types

### SmithersError

```ts
type SmithersError = {
  code: SmithersErrorCode;
  message: string;
  summary: string;
  docsUrl: string;
  details?: Record<string, unknown>;
  cause?: unknown;
};
```

| Field | Type | Description |
|---|---|---|
| `code` | `SmithersErrorCode` | Machine-readable error code |
| `message` | `string` | Error description |
| `summary` | `string` | Short error summary without docs URL suffixing |
| `docsUrl` | `string` | Documentation URL for the error reference |
| `details` | `Record<string, unknown>` | Additional context |
| `cause` | `unknown` | Original nested cause when one is preserved |

### SmithersErrorCode

```ts
type SmithersErrorCode =
  | KnownSmithersErrorCode
  | (string & {});
```

Use `KnownSmithersErrorCode` when you want exhaustive switching over built-in Smithers failures. See [Error Reference](/reference/errors) for the full built-in list.

### KnownSmithersErrorCode

```ts
type KnownSmithersErrorCode =
  | "INVALID_INPUT"
  | "MISSING_INPUT"
  | "MISSING_INPUT_TABLE"
  // ... all built-in Smithers runtime codes
```

This union excludes the custom string escape hatch and is the right type for exhaustive `switch` statements over built-in Smithers errors.
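
To illustrate the difference between the two unions, the sketch below uses local stand-ins limited to the three codes shown above; the real `KnownSmithersErrorCode` is larger. `isKnownCode`, `hintFor`, and `assertNever` are hypothetical names. The `never` check makes the compiler flag any built-in code that the `switch` forgets to handle.

```ts
// Stand-ins for the documented types, limited to the three codes listed above.
type KnownCode = "INVALID_INPUT" | "MISSING_INPUT" | "MISSING_INPUT_TABLE";
type AnyCode = KnownCode | (string & {}); // keeps autocomplete, allows custom codes

const knownCodes: readonly KnownCode[] = [
  "INVALID_INPUT",
  "MISSING_INPUT",
  "MISSING_INPUT_TABLE",
];

// Hypothetical local guard, similar in spirit to isKnownSmithersErrorCode.
function isKnownCode(code: AnyCode): code is KnownCode {
  return (knownCodes as readonly string[]).includes(code);
}

function assertNever(value: never): never {
  throw new Error(`Unhandled error code: ${String(value)}`);
}

// Exhaustive over KnownCode: adding a code to the union without a case
// here becomes a compile-time error at the assertNever call.
function hintFor(code: KnownCode): string {
  switch (code) {
    case "INVALID_INPUT":
      return "Check the workflow input against its schema.";
    case "MISSING_INPUT":
      return "The input row for this run is missing from the database.";
    case "MISSING_INPUT_TABLE":
      return "The workflow schema does not expose the input table.";
    default:
      return assertNever(code);
  }
}
```

Switching over `AnyCode` directly would not be exhaustive, because any string is assignable to it; narrowing with the guard first is what makes the `never` check meaningful.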

---

## Server Types

### ServerOptions

See [HTTP Server](/integrations/server) for details.

```ts
type ServerOptions = {
  port?: number;
  db?: BunSQLiteDatabase<any>;
  authToken?: string;
  maxBodyBytes?: number;
  rootDir?: string;
  allowNetwork?: boolean;
};
```

### ServeOptions

Options for `createServeApp(...)`, the single-run Hono app exported from the root package.

```ts
type ServeOptions = {
  workflow: SmithersWorkflow<any>;
  adapter: SmithersDb;
  runId: string;
  abort: AbortController;
  authToken?: string;
  metrics?: boolean;
};
```

---

## Tool Context Types

### ToolContext

Internal context provided to tools via `AsyncLocalStorage`. Not typically used directly.

```ts
type ToolContext = {
  db: SmithersDb;
  runId: string;
  nodeId: string;
  iteration: number;
  attempt: number;
  rootDir: string;
  allowNetwork: boolean;
  maxOutputBytes: number;
  timeoutMs: number;
  seq: number;
};
```

---

## Package Configuration

> Reference for the smithers-orchestrator package exports, binary entry, TypeScript configuration, and Bun preload setup.
> Source: https://smithers.sh/reference/package-configuration

This page documents the build and package configuration shipped with `smithers-orchestrator`. Use it when setting up a new project, debugging import resolution, or understanding why your `tsconfig.json` needs specific options.

## Binary

```json
"bin": {
  "smithers": "src/cli/index.ts"
}
```

After installing `smithers-orchestrator`, the `smithers` command is available via `bunx smithers-orchestrator` or globally if linked. See [CLI Reference](/cli/overview) for all commands.

## Subpath Exports

Every public import path is listed below. Use the subpath form to import only the surface you need.

| Import path | Entry file | Purpose |
|---|---|---|
| `smithers-orchestrator` | `./src/index.ts` | Core API: `createSmithers`, components, `runWorkflow`, `renderMdx`, errors |
| `smithers-orchestrator/gateway` | `./src/gateway/index.ts` | Gateway client for remote workflow coordination |
| `smithers-orchestrator/jsx-runtime` | `./src/jsx-runtime.ts` | JSX runtime (auto-resolved by `jsxImportSource`) |
| `smithers-orchestrator/jsx-dev-runtime` | `./src/jsx-runtime.ts` | JSX dev runtime (auto-resolved in dev mode) |
| `smithers-orchestrator/tools` | `./src/tools/index.ts` | Tool sandbox: `defineTool`, `read`, `grep`, `bash`, `edit`, `write` |
| `smithers-orchestrator/server` | `./src/server/index.ts` | HTTP server for run management and event streaming |
| `smithers-orchestrator/observability` | `./src/observability/index.ts` | OpenTelemetry traces, metrics, and Grafana stack integration |
| `smithers-orchestrator/pi-plugin` | `./src/pi-plugin/index.ts` | PI CLI agent plugin |
| `smithers-orchestrator/pi-extension` | `./src/pi-plugin/extension.ts` | PI extension UI bridge |
| `smithers-orchestrator/mdx-plugin` | `./src/mdx-plugin.ts` | Bun preload plugin for `.mdx` imports |
| `smithers-orchestrator/dom/renderer` | `./src/dom/renderer.ts` | Internal renderer (advanced use) |
| `smithers-orchestrator/serve` | `./src/server/serve.ts` | Single-workflow HTTP server via `createServeApp` |
| `smithers-orchestrator/scorers` | `./src/scorers/index.ts` | Eval scorers: `createScorer`, `llmJudge`, `aggregate` |
| `smithers-orchestrator/voice` | `./src/voice/index.ts` | Voice input/output integration |
| `smithers-orchestrator/rag` | `./src/rag/index.ts` | RAG document ingestion and retrieval |
| `smithers-orchestrator/memory` | `./src/memory/index.ts` | Cross-run memory storage and recall |
| `smithers-orchestrator/openapi` | `./src/openapi/index.ts` | Generate AI SDK tools from OpenAPI specs |

### Usage

```ts
// Core API
import { createSmithers, runWorkflow } from "smithers-orchestrator";

// Tools
import { defineTool, bash, read, write } from "smithers-orchestrator/tools";

// Scorers
import { createScorer, llmJudge } from "smithers-orchestrator/scorers";

// MDX plugin (in preload.ts)
import { mdxPlugin } from "smithers-orchestrator/mdx-plugin";
```

## TypeScript Configuration

### JSX Import Source

```json
{
  "compilerOptions": {
    "jsx": "react-jsx",
    "jsxImportSource": "smithers-orchestrator"
  }
}
```

This tells TypeScript to resolve JSX transforms from `smithers-orchestrator/jsx-runtime` instead of `react/jsx-runtime`. The Smithers JSX runtime re-exports React's runtime, so component behavior is identical -- but this setting enables proper type resolution for Smithers workflow components.

See [JSX Installation](/jsx/installation) for the complete TypeScript setup.

### Path Aliases

If you are developing inside the `smithers-orchestrator` monorepo, the root `tsconfig.json` defines path aliases so that source imports resolve without a build step:

```json
"paths": {
  "smithers": ["./src/index.ts"],
  "smithers/jsx-runtime": ["./src/jsx-runtime.ts"],
  "smithers/jsx-dev-runtime": ["./src/jsx-runtime.ts"],
  "smithers/tools": ["./src/tools/index.ts"],
  "smithers-orchestrator": ["./src/index.ts"],
  "smithers-orchestrator/tools": ["./src/tools/index.ts"],
  "smithers-orchestrator/scorers": ["./src/scorers/index.ts"]
}
```

The `smithers-orchestrator` entries are backward-compatibility aliases. The package was renamed from `smithers-orchestrator` to `smithers` internally, and these aliases ensure that existing imports and example code continue to resolve.

**End users do not need path aliases.** Path aliases are only needed when developing the framework itself. When you install `smithers-orchestrator` as a dependency, Node/Bun module resolution handles import paths automatically.

### Local Type Root Shims

```json
"typeRoots": ["./src/types", "./node_modules/@types"]
```

The `./src/types` directory contains ambient type declarations that fill gaps in third-party packages. Currently it ships a single shim:

- `react-dom-server.d.ts` -- Declares the `react-dom/server` module so TypeScript does not error when server-side rendering types are referenced.

End users should add `@types/react-dom` to their `devDependencies` instead of relying on this shim.

## Bun Configuration

### Runtime Preload

```toml
# bunfig.toml
preload = ["./preload.ts"]
```

The preload script registers the MDX esbuild plugin with Bun's bundler so that `.mdx` files can be imported as JSX components at runtime. See [MDX Prompts](/guides/mdx-prompts) for details.

### Test Configuration

```toml
[test]
root = "./tests"
preload = ["./preload.ts"]
```

| Key | Value | Purpose |
|---|---|---|
| `root` | `./tests` | Bun discovers test files from this directory instead of scanning the entire project |
| `preload` | `["./preload.ts"]` | Registers the MDX plugin for test files so `.mdx` imports work in tests |

The test preload is separate from the runtime preload. Both point to the same file, but Bun's `[test]` section only applies when running `bun test`. Without it, tests that import `.mdx` files would fail with a module resolution error.

## npm Scripts

These scripts are defined in the root `package.json` for development:

| Script | Command | Purpose |
|---|---|---|
| `typecheck` | `tsc --noEmit` | Type-check the `src/` and `tests/` trees against `tsconfig.json` |
| `typecheck:examples` | `tsc -p tsconfig.examples.json --noEmit` | Type-check example files against a separate config that maps `smithers` to `examples-entry.ts` |
| `lint` | `oxlint ...` | Lint source, test, and CLI code with oxlint |
| `test` | `bash ./scripts/run-all-tests.sh` | Run the full test suite |
| `e2e` | `playwright test` | Run Playwright end-to-end tests against the docs site and integration surfaces |
| `docs` | `cd docs && bunx mintlify dev` | Start the Mintlify docs dev server for local preview |

### For end-user projects

When scaffolding your own project (via `smithers init` or manually), add a typecheck script:

```json
{
  "scripts": {
    "typecheck": "tsc --noEmit"
  }
}
```

See [Production Project Structure](/guides/project-structure) for a complete user-project `package.json` example.

---

## VCS Helper Reference

> Public JJ helper APIs exported by smithers-orchestrator for repo detection, snapshot inspection, and workspace management.
> Source: https://smithers.sh/reference/vcs-helpers

Smithers exports a small JJ helper surface for applications that want to inspect or manage Jujutsu state directly.

These helpers are intentionally lightweight:

- every helper accepts an optional `cwd` so you can target a specific repository
- spawn failures are normalized into a result instead of thrown, so the helpers are safe to call even when `jj` is not installed
- workspace helpers try a few command shapes to tolerate JJ version drift

## Import

```ts
import {
  runJj,
  getJjPointer,
  revertToJjPointer,
  isJjRepo,
  workspaceAdd,
  workspaceList,
  workspaceClose,
} from "smithers-orchestrator";
```

## `runJj(args, opts?)`

Run an arbitrary `jj` command and capture its output.

```ts
const result = await runJj(["status"], { cwd: "/path/to/repo" });
```

```ts
type RunJjOptions = {
  cwd?: string;
};

type RunJjResult = {
  code: number;
  stdout: string;
  stderr: string;
};
```

Notes:

- returns `{ code: 127, stdout: "", stderr: "..." }` when `jj` cannot be started
- does not throw for ordinary process failures
- useful when you need a raw escape hatch beyond the higher-level helpers below
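
Because `runJj` reports failures through the result rather than by throwing, callers typically branch on `code`. The sketch below is one way to do that; `interpretJjResult` and `JjOutcome` are hypothetical names. It assumes exit code `0` means success, and it treats `127` as "could not start" per the note above, although a command could in principle exit with `127` for other reasons.

```ts
// Local copy of the documented result type so the sketch is self-contained.
type RunJjResult = {
  code: number;
  stdout: string;
  stderr: string;
};

// Hypothetical outcome shape for illustration.
type JjOutcome =
  | { ok: true; output: string }
  | { ok: false; reason: "not-started" | "failed"; detail: string };

// Hypothetical helper: classifies a runJj result without throwing.
function interpretJjResult(result: RunJjResult): JjOutcome {
  if (result.code === 0) {
    return { ok: true, output: result.stdout.trim() };
  }
  const reason = result.code === 127 ? "not-started" : "failed";
  return { ok: false, reason, detail: result.stderr.trim() };
}
```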

## `getJjPointer(cwd?)`

Return the current workspace `change_id` for `@`, or `null` when JJ is unavailable or the current directory is not a JJ repo.

```ts
const pointer = await getJjPointer("/path/to/repo");
```

```ts
function getJjPointer(cwd?: string): Promise<string | null>;
```

Smithers uses the same pointer model internally for revert support and cache invalidation.

## `revertToJjPointer(pointer, cwd?)`

Restore the working copy from a previously recorded JJ pointer.

```ts
const result = await revertToJjPointer("zqkopwvn", "/path/to/repo");
```

```ts
type JjRevertResult =
  | { success: true }
  | { success: false; error?: string };
```

This helper wraps `jj restore --from <pointer>`.
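
Since `error` is optional even on failure, code that reports the outcome should supply its own fallback text. A minimal sketch, using a hypothetical `revertMessage` helper:

```ts
// Local copy of the documented result type so the sketch is self-contained.
type JjRevertResult =
  | { success: true }
  | { success: false; error?: string };

// Hypothetical helper: user-facing message for a revert outcome.
function revertMessage(pointer: string, result: JjRevertResult): string {
  if (result.success) {
    return `Restored working copy from ${pointer}`;
  }
  // `error` may be absent, so fall back to a generic explanation.
  return `Revert to ${pointer} failed: ${result.error ?? "no error message provided"}`;
}
```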

## `isJjRepo(cwd?)`

Detect whether a directory is a readable JJ repository.

```ts
const enabled = await isJjRepo("/path/to/repo");
```

```ts
function isJjRepo(cwd?: string): Promise<boolean>;
```

Use this before showing JJ-specific UI or attempting a revert flow.

## `workspaceAdd(name, path, opts?)`

Create a JJ workspace with a friendly name at a target filesystem path.

```ts
const result = await workspaceAdd("feature-auth", "/tmp/wt-feature-auth", {
  cwd: "/path/to/repo",
  atRev: "@",
});
```

```ts
type WorkspaceAddOptions = {
  cwd?: string;
  atRev?: string;
};

type WorkspaceResult =
  | { success: true }
  | { success: false; error?: string };
```

Behavior notes:

- removes an existing workspace with the same name before retrying
- recreates the target directory if needed
- tries multiple `jj workspace add` syntaxes to work across JJ versions

## `workspaceList(cwd?)`

List known workspaces for the current JJ repo.

```ts
const workspaces = await workspaceList("/path/to/repo");
```

```ts
type WorkspaceInfo = {
  name: string;
  path: string | null;
  selected: boolean;
};
```

The current implementation prefers template output when supported, then falls back to parsing the human-readable `jj workspace list` output.
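
As a usage sketch, the returned list can be reduced to the details an application usually needs: which workspace is selected, and which entries came back without a path. `summarizeWorkspaces` is a hypothetical helper; it only relies on the `WorkspaceInfo` shape shown above.

```ts
// Local copy of the documented type so the sketch is self-contained.
type WorkspaceInfo = {
  name: string;
  path: string | null;
  selected: boolean;
};

// Hypothetical helper: picks out the selected workspace and any
// entries whose path is null.
function summarizeWorkspaces(list: WorkspaceInfo[]): {
  selected: string | null;
  withoutPath: string[];
} {
  return {
    selected: list.find((ws) => ws.selected)?.name ?? null,
    withoutPath: list.filter((ws) => ws.path === null).map((ws) => ws.name),
  };
}
```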

## `workspaceClose(name, opts?)`

Forget a JJ workspace by name.

```ts
const result = await workspaceClose("feature-auth", {
  cwd: "/path/to/repo",
});
```

```ts
function workspaceClose(
  name: string,
  opts?: { cwd?: string },
): Promise<WorkspaceResult>;
```

This wraps `jj workspace forget <name>`.

## When To Use These Helpers

Use these helpers when your application needs to:

- show whether JJ-backed revert is available
- record or inspect a pointer outside the Smithers engine
- manage JJ workspaces directly from an app or integration layer

If you only need workflow-level revert behavior, prefer the runtime and CLI docs:

- [VCS Integration](/guides/vcs)
- [CLI Reference](/cli/overview)
- [Revert](/runtime/revert)

---

## Error Reference

> Exhaustive Smithers error codes, typed error helpers, and HTTP API error responses.
> Source: https://smithers.sh/reference/errors

```ts
import {
  ERROR_REFERENCE_URL,
  SmithersErrorInstance,
  errorToJson,
  getSmithersErrorDefinition,
  getSmithersErrorDocsUrl,
  isKnownSmithersErrorCode,
  isSmithersError,
  knownSmithersErrorCodes,
} from "smithers-orchestrator";
import type {
  KnownSmithersErrorCode,
  SmithersError,
  SmithersErrorCode,
} from "smithers-orchestrator";
```

Every built-in `SmithersErrorInstance` carries three pieces of documentation metadata:

| Field | Meaning |
|---|---|
| `message` | Human-readable message with a docs link appended. |
| `summary` | The raw message without the docs suffix. |
| `docsUrl` | The reference URL for Smithers errors. |

Use `KnownSmithersErrorCode` when you want an exhaustive switch over the built-in Smithers codes. `SmithersErrorCode` still includes the `(string & {})` escape hatch for user-defined custom codes.

| Export | Kind | Description |
|---|---|---|
| `SmithersErrorInstance` | class | Runtime error class used throughout Smithers internals. |
| `isSmithersError(err)` | function | Type guard for values carrying a Smithers-style `code`. |
| `isKnownSmithersErrorCode(code)` | function | Narrows a string to the built-in exhaustive error-code union. |
| `knownSmithersErrorCodes` | value | Array of every built-in Smithers error code documented on this page. |
| `getSmithersErrorDocsUrl(code)` | function | Returns the docs URL appended to built-in error messages. |
| `getSmithersErrorDefinition(code)` | function | Returns category, description, and details metadata for known codes. |
| `errorToJson(err)` | function | Serializes `message`, `summary`, `docsUrl`, `code`, `details`, `cause`, and `stack`. |
| `ERROR_REFERENCE_URL` | value | Base docs URL for Smithers runtime errors. |
| `KnownSmithersErrorCode` | type | Exact built-in Smithers code union. |
| `SmithersErrorCode` | type | Built-in codes plus the custom string escape hatch. |
| `SmithersError` | type | Public typed shape for serialized Smithers errors. |

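```ts
import { isKnownSmithersErrorCode, isSmithersError } from "smithers-orchestrator";

// `runWorkflow` and `workflow` are assumed to be in scope from your own setup.
try {
  await runWorkflow(workflow, { input: {} });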
} catch (err) {
  if (isSmithersError(err) && isKnownSmithersErrorCode(err.code)) {
    switch (err.code) {
      case "INVALID_INPUT":
        console.error("Bad input:", err.summary);
        break;
      case "AGENT_CLI_ERROR":
        console.error("Agent failed:", err.summary);
        break;
      default:
        console.error(`[${err.code}] ${err.summary}`);
    }

    console.error("Docs:", err.docsUrl);
  }
}
```
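
The difference between the two code types is easiest to see in a compile-time exhaustiveness check. The following is a minimal sketch, not the library's implementation: `KnownCode`, `AnyCode`, `isKnownCode`, and `adviceFor` are hypothetical stand-ins for `KnownSmithersErrorCode`, `SmithersErrorCode`, and `isKnownSmithersErrorCode`, and only three codes are listed to keep it short.

```typescript
// Hypothetical stand-ins: the real unions are exported by smithers-orchestrator
// and cover every code on this page. Three codes keep the sketch short.
type KnownCode = "INVALID_INPUT" | "TASK_TIMEOUT" | "RUN_NOT_FOUND";
type AnyCode = KnownCode | (string & {}); // escape hatch for custom codes

const knownCodes: readonly KnownCode[] = ["INVALID_INPUT", "TASK_TIMEOUT", "RUN_NOT_FOUND"];

// Type guard in the spirit of isKnownSmithersErrorCode.
function isKnownCode(code: AnyCode): code is KnownCode {
  return (knownCodes as readonly string[]).includes(code);
}

function adviceFor(code: AnyCode): string {
  if (!isKnownCode(code)) return `custom code: ${code}`;
  switch (code) {
    case "INVALID_INPUT":
      return "fix the input payload";
    case "TASK_TIMEOUT":
      return "raise the task timeout or speed up the task";
    case "RUN_NOT_FOUND":
      return "check the run ID";
    default: {
      // Adding a member to KnownCode without a matching case turns this
      // assignment into a type error.
      const unreachable: never = code;
      return unreachable;
    }
  }
}
```

Narrowing to the known union first is what makes the `never` check possible; switching directly on the wider type would always need a catch-all branch for custom strings.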

## Engine

| Code | When | Details |
|---|---|---|
| `INVALID_INPUT` | Workflow input fails validation or the runtime receives a non-object input payload. | -- |
| `MISSING_INPUT` | A resume run references an input row that is missing from the database. | -- |
| `MISSING_INPUT_TABLE` | The workflow schema does not expose the expected input table during resume or hydration. | -- |
| `RESUME_METADATA_MISMATCH` | Stored run metadata no longer matches the workflow being resumed. | -- |
| `UNKNOWN_OUTPUT_SCHEMA` | A task references an output table that is not present in the schema registry. | -- |
| `INVALID_OUTPUT` | Agent output cannot be parsed or validated against the declared output schema. | -- |
| `WORKTREE_CREATE_FAILED` | Smithers fails to create or hydrate a git or jj worktree for a task. | `{ worktreePath, vcsType, branch? }` |
| `VCS_NOT_FOUND` | No supported git or jj repository root can be found for the workflow. | `{ rootDir }` |
| `SNAPSHOT_NOT_FOUND` | A requested time-travel snapshot or frame does not exist. | `{ runId, frameNo }` |
| `VCS_WORKSPACE_CREATE_FAILED` | Smithers fails to materialize a jj workspace for time-travel or replay. | `{ runId, frameNo, vcsPointer, workspacePath }` |
| `TASK_TIMEOUT` | A task compute callback exceeds its configured timeout. | `{ nodeId, attempt, timeoutMs }` |
| `TASK_ABORTED` | A running task is aborted through an AbortSignal or shutdown path. | -- |
| `RUN_NOT_FOUND` | A CLI or engine command references a run ID that does not exist in the database. | `{ runId }` |
| `NODE_NOT_FOUND` | A CLI command references a node ID that does not exist for the given run. | `{ runId, nodeId }` |
| `UI_COMMAND_FAILED` | The `smithers ui` command fails to open the browser or probe the server. | `{ url }` |
| `INVALID_EVENTS_OPTIONS` | The `smithers events` command receives invalid filter options. | -- |
| `SANDBOX_BUNDLE_INVALID` | A sandbox bundle fails validation (missing README, invalid manifest, etc.). | `{ bundlePath }` |
| `SANDBOX_BUNDLE_TOO_LARGE` | A sandbox bundle exceeds the maximum allowed size. | `{ bundlePath, maxBytes }` |
| `WORKFLOW_EXECUTION_FAILED` | A child or builder workflow exits unsuccessfully without surfacing a typed error payload. | `{ status }` |
| `SANDBOX_EXECUTION_FAILED` | Sandbox setup or execution fails before a more specific sandbox error can be emitted. | `{ sandboxId, runId?, maxConcurrent?, activeSandboxCount? }` |
| `TASK_HEARTBEAT_TIMEOUT` | A task has not heartbeated within its configured timeout. | `{ nodeId, iteration, attempt, timeoutMs, staleForMs, lastHeartbeatAtMs }` |
| `HEARTBEAT_PAYLOAD_TOO_LARGE` | A task heartbeat payload exceeds the maximum allowed size. | `{ nodeId, sizeBytes, maxBytes }` |
| `HEARTBEAT_PAYLOAD_NOT_JSON_SERIALIZABLE` | A task heartbeat payload cannot be serialized to JSON. | `{ nodeId }` |

## Components

| Code | When | Details |
|---|---|---|
| `TASK_ID_REQUIRED` | `<Task>` is missing a valid string id. | -- |
| `TASK_MISSING_OUTPUT` | `<Task>` is missing its output prop. | `{ nodeId }` |
| `DUPLICATE_ID` | Two nodes with the same runtime id are mounted in one workflow graph. | `{ kind, id }` |
| `NESTED_LOOP` | `<Loop>` or `<Ralph>` is nested inside another loop construct that Smithers does not support. | -- |
| `WORKTREE_EMPTY_PATH` | `<Worktree>` is mounted with an empty path. | -- |
| `MDX_PRELOAD_INACTIVE` | A prompt object is rendered without the MDX preload layer being active. | -- |
| `CONTEXT_OUTSIDE_WORKFLOW` | Workflow context access happens outside an active Smithers workflow render. | -- |
| `MISSING_OUTPUT` | Code calls `ctx.output()` for a node result that does not exist. | `{ nodeId, iteration }` |
| `DEP_NOT_SATISFIED` | A typed dep on `<Task>` references an upstream output that has not been produced yet. | `{ taskId, depKey, resolvedNodeId }` |
| `ASPECT_BUDGET_EXCEEDED` | An Aspects budget (tokens, latency, or cost) has been exceeded. | `{ kind, limit, current }` |
| `APPROVAL_OUTSIDE_TASK` | `<Approval>` is resolved outside the active task runtime. | -- |
| `WORKFLOW_MISSING_DEFAULT` | A workflow module does not export a default Smithers workflow. | -- |

## Tools

| Code | When | Details |
|---|---|---|
| `TOOL_PATH_INVALID` | A filesystem tool receives a non-string path. | -- |
| `TOOL_PATH_ESCAPE` | A filesystem tool resolves a path outside the sandbox root, including through symlinks. | -- |
| `TOOL_FILE_TOO_LARGE` | A read or edit operation exceeds the configured file size limit. | -- |
| `TOOL_CONTENT_TOO_LARGE` | A write operation exceeds the configured content size limit. | -- |
| `TOOL_PATCH_TOO_LARGE` | An edit patch exceeds the configured patch size limit. | -- |
| `TOOL_PATCH_FAILED` | A unified diff patch cannot be applied to the target file. | -- |
| `TOOL_NETWORK_DISABLED` | The bash tool tries to access the network while network access is disabled. | -- |
| `TOOL_GIT_REMOTE_DISABLED` | The bash tool attempts a remote git operation while network access is disabled. | -- |
| `TOOL_COMMAND_FAILED` | A bash tool command exits with a non-zero status. | -- |
| `TOOL_GREP_FAILED` | The grep tool fails with an rg execution error. | -- |

## Agents

| Code | When | Details |
|---|---|---|
| `AGENT_CLI_ERROR` | A CLI-backed agent exits unsuccessfully, streams an explicit error, or its RPC transport fails. | -- |
| `AGENT_RPC_FILE_ARGS` | Pi RPC mode is used with file arguments that the transport does not support. | -- |
| `AGENT_BUILD_COMMAND` | An agent implementation forbids `buildCommand()` because it uses a custom `generate()` transport. | -- |
| `AGENT_DIAGNOSTIC_TIMEOUT` | An internal agent diagnostic check exceeds the per-check timeout budget. | -- |

## Database

| Code | When | Details |
|---|---|---|
| `DB_MISSING_COLUMNS` | A table used by Smithers does not expose required columns such as `runId` or `nodeId`. | -- |
| `DB_REQUIRES_BUN_SQLITE` | The database adapter is not backed by a Bun SQLite client with `exec()`. | -- |
| `DB_QUERY_FAILED` | A database read query throws or rejects while running inside an Effect. | -- |
| `DB_WRITE_FAILED` | A database write or migration fails, including after SQLite retry exhaustion. | -- |

## Effect / Runtime

| Code | When | Details |
|---|---|---|
| `INTERNAL_ERROR` | An unexpected internal exception crossed an Effect boundary without a more specific Smithers code. | -- |
| `PROCESS_ABORTED` | A spawned child process is aborted by signal or shutdown. | `{ command, args, cwd }` |
| `PROCESS_TIMEOUT` | A spawned child process exceeds its total timeout. | `{ command, args, cwd, timeoutMs }` |
| `PROCESS_IDLE_TIMEOUT` | A spawned child process stops producing output longer than its idle timeout. | `{ command, args, cwd, idleTimeoutMs }` |
| `PROCESS_SPAWN_FAILED` | The runtime cannot spawn the requested child process. | `{ command, args, cwd }` |
| `TASK_RUNTIME_UNAVAILABLE` | Builder task runtime APIs are accessed outside an executing step. | -- |

## Hot Reload

| Code | When | Details |
|---|---|---|
| `SCHEMA_CHANGE_HOT` | Hot reload detects a schema change that requires a full restart. | -- |
| `HOT_OVERLAY_FAILED` | Building or cleaning the generated hot-reload overlay fails. | -- |
| `HOT_RELOAD_INVALID_MODULE` | A hot-reloaded workflow module does not export a valid default workflow build. | -- |

## Scorers

| Code | When | Details |
|---|---|---|
| `SCORER_FAILED` | A scorer throws or rejects while Smithers is evaluating a result. | -- |

## CLI

| Code | When | Details |
|---|---|---|
| `WORKFLOW_EXISTS` | The workflow creation CLI refuses to overwrite an existing workflow file. | -- |
| `PROMPT_EXISTS` | The prompt creation CLI refuses to overwrite an existing prompt file. | -- |
| `PROMPT_MDX_INVALID` | An MDX prompt file does not export a valid default component. | -- |
| `TICKET_EXISTS` | The ticket creation CLI refuses to overwrite an existing ticket file. | -- |
| `TICKET_NOT_FOUND` | A CLI command references a ticket file that does not exist. | -- |
| `CLI_DB_NOT_FOUND` | A CLI command cannot find a nearby `smithers.db` file. | -- |
| `CLI_AGENT_UNSUPPORTED` | The `ask` command selects an agent integration that Smithers does not support in that mode. | -- |

## Integrations

| Code | When | Details |
|---|---|---|
| `PI_HTTP_ERROR` | The Pi or server integration receives a non-success HTTP response from Smithers. | -- |
| `EXTERNAL_BUILD_FAILED` | An external workflow host fails to build a Smithers HostNode payload. | `{ scriptPath, error?, exitCode?, stderr?, stdout? }` |
| `SCHEMA_DISCOVERY_FAILED` | External workflow schema discovery fails or returns invalid output. | `{ scriptPath, error?, exitCode?, stderr? }` |
| `OPENAPI_SPEC_LOAD_FAILED` | An OpenAPI spec cannot be loaded or parsed. | -- |
| `OPENAPI_OPERATION_NOT_FOUND` | The requested operationId does not exist in the OpenAPI spec. | -- |
| `OPENAPI_TOOL_EXECUTION_FAILED` | An OpenAPI tool call fails during HTTP execution. | -- |

## HTTP API Errors

These are JSON response codes, not `SmithersErrorInstance` objects.

| Code | Status | When |
|---|---|---|
| `INVALID_REQUEST` | 400 | Invalid request body or query params |
| `PAYLOAD_TOO_LARGE` | 413 | Body exceeds `maxBodyBytes` |
| `INVALID_JSON` | 400 | Body not valid JSON |
| `SERVER_ERROR` | 500 | Unexpected server error |
| `UNAUTHORIZED` | 401 | Missing or invalid auth token |
| `WORKFLOW_PATH_OUTSIDE_ROOT` | 400 | Workflow path outside server root |
| `RUN_ID_REQUIRED` | 400 | `runId` required when `resume: true` |
| `RUN_ALREADY_EXISTS` | 409 | Run ID already exists |
| `RUN_NOT_FOUND` | 404 | No run with given ID |
| `RUN_NOT_ACTIVE` | 409 | Run not active (cannot cancel) |
| `NOT_FOUND` | 404 | Route or resource not found |
| `DB_NOT_CONFIGURED` | 400 | Server database not configured |
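
A client can keep the table above as a lookup so that unknown codes are detected instead of silently passed through. This is a sketch under assumptions: `httpErrorStatus`, `isHttpErrorCode`, and `statusFor` are hypothetical helper names, and the shape of the JSON response body is not specified here, so only the code-to-status mapping is modeled.

```typescript
// Status codes copied from the HTTP API Errors table.
const httpErrorStatus = {
  INVALID_REQUEST: 400,
  PAYLOAD_TOO_LARGE: 413,
  INVALID_JSON: 400,
  SERVER_ERROR: 500,
  UNAUTHORIZED: 401,
  WORKFLOW_PATH_OUTSIDE_ROOT: 400,
  RUN_ID_REQUIRED: 400,
  RUN_ALREADY_EXISTS: 409,
  RUN_NOT_FOUND: 404,
  RUN_NOT_ACTIVE: 409,
  NOT_FOUND: 404,
  DB_NOT_CONFIGURED: 400,
} as const;

type HttpErrorCode = keyof typeof httpErrorStatus;

// Narrows an arbitrary string to one of the documented HTTP error codes.
function isHttpErrorCode(code: string): code is HttpErrorCode {
  return Object.prototype.hasOwnProperty.call(httpErrorStatus, code);
}

// Returns the documented status, or undefined for codes not in the table.
function statusFor(code: string): number | undefined {
  return isHttpErrorCode(code) ? httpErrorStatus[code] : undefined;
}
```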

## Related

- [Error Handling Guide](/guides/error-handling)
- [Debugging Guide](/guides/debugging)
- [Troubleshooting](/guides/troubleshooting)

---

## Effect Integration

> Low-level Effect-ts integration layer for power users who need direct access to Smithers internals.
> Source: https://smithers.sh/api/effect

The Effect integration layer is Smithers' third and lowest API tier. The TOON builder API and JSX mirror it, and most workflows never need to reach this far. You reach for it when you need direct control over execution boundaries, custom bridging logic, or the full expressiveness of the Effect type system against Smithers internals.

All modules live under `src/effect/` and are imported from `smithers-orchestrator/effect/*`. They assume familiarity with the [Effect](https://effect.website) library.

---

## Runners

Three functions run Effect programs inside the shared Smithers managed runtime. The runtime is initialized once, annotates all logs with `"service": "smithers"`, and normalizes failures to `SmithersError`.

### EFFECT_RUN_PROMISE

Runs an Effect and returns a `Promise`. The promise rejects with a `SmithersError` on failure.

```ts
import { runPromise } from "smithers-orchestrator/effect/runtime";
import { Effect } from "effect";

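// Passed to runPromise below as its cancellation signal.
const abortController = new AbortController();

const result = await runPromise(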
  Effect.gen(function* () {
    yield* Effect.log("starting");
    return 42;
  }),
  { signal: abortController.signal },
);
```

```ts
function runPromise<A, E, R>(
  effect: Effect.Effect<A, E, R>,
  options?: { signal?: AbortSignal },
): Promise<A>
```

### EFFECT_RUN_SYNC

Runs a synchronous Effect immediately. Throws a `SmithersError` on failure. Use this only when you are certain the Effect performs no async work.

```ts
import { runSync } from "smithers-orchestrator/effect/runtime";
import { Effect } from "effect";

const value = runSync(Effect.succeed("hello"));
```

```ts
function runSync<A, E, R>(
  effect: Effect.Effect<A, E, R>,
): A
```

### EFFECT_RUN_FORK

Forks an Effect as a background fiber. Returns the fiber immediately without awaiting the result. Suitable for fire-and-forget side effects like metric updates.

```ts
import { runFork } from "smithers-orchestrator/effect/runtime";
import { Metric } from "effect";
import { runsTotal } from "smithers-orchestrator/effect/metrics";

runFork(Metric.increment(runsTotal));
```

```ts
function runFork<A, E, R>(
  effect: Effect.Effect<A, E, R>,
): Fiber<A, E>
```

---

### EFFECT_SINGLE_RUNNER

The `EFFECT_SINGLE_RUNNER` pattern provides a singleton `@effect/cluster` `SingleRunner` backed by an in-memory SQLite database. It manages the task worker entity lifecycle, serializes dispatches by bridge key, and survives multiple concurrent callers sharing the same runtime.

The singleton is initialized lazily on first dispatch and reused for the lifetime of the process.

```ts
import {
  dispatchWorkerTask,
  subscribeTaskWorkerDispatches,
} from "smithers-orchestrator/effect/single-runner";

// Dispatch a registered worker task
const result = await dispatchWorkerTask(task, async () => {
  await doWork();
  return { terminal: true };
});

// Observe dispatches (useful for testing or TUI)
const unsubscribe = subscribeTaskWorkerDispatches((task) => {
  console.log("dispatched", task.executionId);
});
```

```ts
function dispatchWorkerTask(
  task: WorkerTask,
  execute: () => Promise<{ terminal: boolean }>,
): Promise<{ terminal: boolean }>

function subscribeTaskWorkerDispatches(
  subscriber: (task: WorkerTask) => void,
): () => void
```
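
Serializing dispatches by key can be pictured as one promise chain per key. The sketch below is an illustration of that idea only, not how `SingleRunner` is implemented: `dispatchSerialized` and the `tails` map are hypothetical, and the map is never pruned.

```typescript
// One promise "tail" per key; each new dispatch is chained onto the tail.
const tails = new Map<string, Promise<unknown>>();

function dispatchSerialized<T>(key: string, execute: () => Promise<T>): Promise<T> {
  const previous = tails.get(key) ?? Promise.resolve();
  // Start only after the previous dispatch for this key has settled,
  // regardless of whether it succeeded or failed.
  const next = previous.then(execute, execute);
  // Store a tail that never rejects so one failure does not poison the chain.
  tails.set(key, next.catch(() => undefined));
  return next;
}
```

Dispatches that use different keys still run concurrently; only calls that share a key are queued behind each other.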

---

### EFFECT_TASK_RUNTIME

Task-scoped context propagated via `AsyncLocalStorage`. Tools and compute callbacks read this to find the current `runId`, `stepId`, heartbeat handle, and abort signal without threading the values through call stacks.

```ts
import {
  getTaskRuntime,
  requireTaskRuntime,
  withTaskRuntime,
  SmithersTaskRuntime,
} from "smithers-orchestrator/effect/task-runtime";

// Inside a compute callback — access the current runtime
const rt = requireTaskRuntime();
rt.heartbeat({ progress: 0.5 });

// Establish a new runtime scope (used internally by the engine)
const result = withTaskRuntime(
  { runId, stepId, attempt, iteration, signal, db, heartbeat, lastHeartbeat: null },
  () => desc.computeFn!(),
);
```

```ts
type SmithersTaskRuntime = {
  runId: string;
  stepId: string;
  attempt: number;
  iteration: number;
  signal: AbortSignal;
  db: any;
  heartbeat: (data?: unknown) => void;
  lastHeartbeat: unknown | null;
};

function withTaskRuntime<T>(runtime: SmithersTaskRuntime, execute: () => T): T
function getTaskRuntime(): SmithersTaskRuntime | undefined
function requireTaskRuntime(): SmithersTaskRuntime
```

`requireTaskRuntime` throws a `SmithersError` with code `TASK_RUNTIME_UNAVAILABLE` when called outside a task execution scope.
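
The propagation mechanism itself is plain Node.js `AsyncLocalStorage`. As a rough sketch of the pattern, with a trimmed-down hypothetical runtime type (`MiniTaskRuntime`) and hypothetical helper names standing in for the real exports:

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical, reduced runtime shape; the real SmithersTaskRuntime has more fields.
type MiniTaskRuntime = {
  runId: string;
  stepId: string;
  heartbeat: (data?: unknown) => void;
};

const storage = new AsyncLocalStorage<MiniTaskRuntime>();

// Everything called from `execute` can see `runtime`.
function withMiniRuntime<T>(runtime: MiniTaskRuntime, execute: () => T): T {
  return storage.run(runtime, execute);
}

function getMiniRuntime(): MiniTaskRuntime | undefined {
  return storage.getStore();
}

function requireMiniRuntime(): MiniTaskRuntime {
  const runtime = storage.getStore();
  if (!runtime) throw new Error("TASK_RUNTIME_UNAVAILABLE");
  return runtime;
}

// A helper deep in the call stack reads the runtime without receiving it as an argument.
function reportProgress(progress: number): string {
  const rt = requireMiniRuntime();
  rt.heartbeat({ progress });
  return `${rt.runId}/${rt.stepId}`;
}

const beats: unknown[] = [];
const label = withMiniRuntime(
  { runId: "run-1", stepId: "analyze", heartbeat: (data) => beats.push(data) },
  () => reportProgress(0.5),
);
```

Outside the `withMiniRuntime` callback the store is empty, which is why the "get" variant returns `undefined` and the "require" variant throws.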

---

## Bridges

Bridges connect the Smithers engine to Effect programs. The engine dispatches execution to a bridge; the bridge translates that into the correct Effect constructs and reports results back.

### EFFECT_ACTIVITY_BRIDGE

The activity bridge wraps legacy task execution in an `@effect/workflow` Activity with idempotency and retry semantics. Each task maps to a `SmithersTaskBridge` workflow instance keyed by adapter namespace, workflow name, run ID, node ID, and iteration.

```ts
import {
  executeTaskActivity,
  makeTaskActivity,
  makeTaskBridgeKey,
  RetriableTaskFailure,
} from "smithers-orchestrator/effect/activity-bridge";

const result = await executeTaskActivity(
  adapter,
  "my-workflow",
  runId,
  desc,
  async (context) => {
    // context.attempt — current attempt number (1-based)
    // context.idempotencyKey — stable key for this attempt
    return computeResult();
  },
  {
    initialAttempt: 1,
    retry: { times: 3, while: (e) => e instanceof RetriableTaskFailure },
  },
);
```

```ts
type TaskActivityContext = {
  attempt: number;
  idempotencyKey: string;
};

type ExecuteTaskActivityOptions = {
  initialAttempt?: number;
  retry?: false | { times: number; while?: (error: unknown) => boolean };
  includeAttemptInIdempotencyKey?: boolean;
};

function executeTaskActivity<A>(
  adapter: SmithersDb,
  workflowName: string,
  runId: string,
  desc: TaskDescriptor,
  executeFn: (context: TaskActivityContext) => Promise<A> | A,
  options?: ExecuteTaskActivityOptions,
): Promise<A>
```

`RetriableTaskFailure` is a sentinel error class. Throw it from an activity to trigger a retry within the bridge's retry loop.
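
The interaction between a sentinel error and a `while` predicate can be sketched without the bridge. Everything here is hypothetical: `RetriableFailure` and `runWithRetry` are stand-ins, the loop is synchronous for brevity, and it assumes `times` counts retries after the first attempt, which the reference above does not state.

```typescript
// Hypothetical stand-in for the sentinel class exported by the activity bridge.
class RetriableFailure extends Error {}

type RetryPolicy = { times: number; while?: (error: unknown) => boolean };

// Assumption: `times` counts retries after the first attempt,
// so `times: 3` allows up to four attempts in total.
function runWithRetry<A>(execute: (attempt: number) => A, retry: RetryPolicy): A {
  let attempt = 1;
  for (;;) {
    try {
      return execute(attempt);
    } catch (error) {
      const shouldRetry = retry.while ? retry.while(error) : true;
      // Give up when the retry budget is spent or the predicate rejects the error.
      if (attempt > retry.times || !shouldRetry) throw error;
      attempt += 1;
    }
  }
}

const attempts: number[] = [];
const value = runWithRetry(
  (attempt) => {
    attempts.push(attempt);
    if (attempt < 3) throw new RetriableFailure("transient");
    return "done";
  },
  { times: 3, while: (error) => error instanceof RetriableFailure },
);
```

Any error that is not an instance of the sentinel fails the predicate and is rethrown on the first attempt.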

---

### EFFECT_WORKFLOW_BRIDGE

The workflow bridge is the top-level seam that routes a task to the appropriate execution path: `compute`, `static`, or legacy. It manages inflight and completed execution maps to prevent duplicate dispatches across concurrent engine invocations.

```ts
import { executeTaskBridge } from "smithers-orchestrator/effect/workflow-bridge";

await executeTaskBridge(
  adapter, db, runId, desc, descriptorMap,
  inputTable, eventBus, toolConfig, "my-workflow",
  cacheEnabled, signal, disabledAgents,
  runAbortController, hijackState,
  legacyExecuteTaskFn,
);
```

The bridge classifies each task before dispatch:

| Classification | Condition |
|---|---|
| `"compute"` | `desc.computeFn` set, no agent, no cache, no worktree, no scorers |
| `"static"` | `desc.staticPayload` set, no agent, no cache, no worktree, no scorers |
| `"legacy"` | Everything else — forwarded to `legacyExecuteTaskFn` |
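
Read as code, the classification table is a short decision function. The sketch below is an approximation: only `computeFn` and `staticPayload` are field names taken from the table, while `agent`, `worktreePath`, `scorers`, and the `classifyTask` helper are assumed names. It also assumes `compute` wins if both `computeFn` and `staticPayload` are set.

```typescript
// Hypothetical descriptor shape; agent, worktreePath, and scorers are assumed names.
type MiniDescriptor = {
  computeFn?: () => unknown;
  staticPayload?: unknown;
  agent?: unknown;
  worktreePath?: string;
  scorers?: Record<string, unknown>;
};

type Classification = "compute" | "static" | "legacy";

function classifyTask(desc: MiniDescriptor, cacheEnabled: boolean): Classification {
  const hasScorers = desc.scorers !== undefined && Object.keys(desc.scorers).length > 0;
  // Shared precondition of both bridge-managed paths.
  const bridgeManaged = !desc.agent && !cacheEnabled && !desc.worktreePath && !hasScorers;
  if (!bridgeManaged) return "legacy";
  if (desc.computeFn) return "compute";
  if (desc.staticPayload !== undefined) return "static";
  return "legacy";
}
```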

---

### EFFECT_WORKFLOW_MAKE_BRIDGE

Wraps an entire workflow body execution in an `@effect/workflow` Workflow using `AsyncLocalStorage` to thread the bridge runtime through the call stack. Used for child workflow execution and continue-as-new semantics.

```ts
import {
  runWorkflowWithMakeBridge,
  withWorkflowMakeBridgeRuntime,
  getWorkflowMakeBridgeRuntime,
} from "smithers-orchestrator/effect/workflow-make-bridge";

const result = await runWorkflowWithMakeBridge(
  workflow,
  { runId, input: { repo: "acme/core" }, resume: false },
  (wf, opts) => engine.run(wf, opts),
);
```

The bridge handles continue-as-new by looping internally until the run settles at a terminal or suspending status. Child workflows registered within the same scope share the parent's `Scope` and engine context.

```ts
function runWorkflowWithMakeBridge<Schema>(
  workflow: SmithersWorkflow<Schema>,
  opts: RunOptions & { runId: string },
  executeBody: (workflow: SmithersWorkflow<Schema>, opts: RunOptions) => Promise<RunResult>,
): Promise<RunResult>

function withWorkflowMakeBridgeRuntime<T>(
  runtime: WorkflowMakeBridgeRuntime,
  execute: () => T,
): T

function getWorkflowMakeBridgeRuntime(): WorkflowMakeBridgeRuntime | undefined
```
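
The continue-as-new loop described above can be pictured as follows. This is a sketch under stated assumptions, not the bridge's code: the status names, the `MiniResult` type, the `runUntilSettled` helper, the `maxHops` guard, and the idea that a continued run hands over a new run ID are all hypothetical.

```typescript
// Hypothetical statuses; the real RunResult type may use different names.
type MiniStatus = "continued" | "finished" | "failed" | "waiting-approval";
type MiniResult = { runId: string; status: MiniStatus };

async function runUntilSettled(
  runId: string,
  executeBody: (runId: string) => Promise<MiniResult>,
  maxHops = 100,
): Promise<MiniResult> {
  let current = runId;
  for (let hop = 0; hop < maxHops; hop++) {
    const result = await executeBody(current);
    // Terminal ("finished", "failed") and suspending ("waiting-approval")
    // statuses end the loop; only "continued" starts another pass.
    if (result.status !== "continued") return result;
    current = result.runId;
  }
  throw new Error(`run did not settle after ${maxHops} hops`);
}
```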

---

### EFFECT_COMPUTE_TASK_BRIDGE

Executes a `computeFn` task directly within the bridge — no agent involved. Manages the full attempt lifecycle: DB insert, node state transitions, heartbeat flushing, timeout enforcement, and event emission. Integrates with the heartbeat watchdog when `desc.heartbeatTimeoutMs` is set.

```ts
import { executeComputeTaskBridge } from "smithers-orchestrator/effect/compute-task-bridge";

await executeComputeTaskBridge(
  adapter, db, runId, desc, eventBus,
  { rootDir: "/workspace" },
  "my-workflow",
  signal,
);
```

Eligibility check:

```ts
import { canExecuteBridgeManagedComputeTask } from "smithers-orchestrator/effect/compute-task-bridge";

const eligible = canExecuteBridgeManagedComputeTask(desc, cacheEnabled);
// true when: desc.computeFn set, no agent, no cache, no worktree, no scorers
```

---

### EFFECT_STATIC_TASK_BRIDGE

Executes a `staticPayload` task without invoking any agent or compute function. The payload is validated against the output schema and persisted immediately.

```ts
import { executeStaticTaskBridge } from "smithers-orchestrator/effect/static-task-bridge";

await executeStaticTaskBridge(
  adapter, runId, desc, eventBus,
  { rootDir: "/workspace" },
  "my-workflow",
  signal,
);
```

Eligibility check:

```ts
import { canExecuteBridgeManagedStaticTask } from "smithers-orchestrator/effect/static-task-bridge";

const eligible = canExecuteBridgeManagedStaticTask(desc, cacheEnabled);
// true when: desc.staticPayload set, no agent, no cache, no worktree, no scorers
```

---

### EFFECT_DEFERRED_BRIDGE

Lightweight in-memory deferred resolution map. Stores `Exit` values keyed by `(runId, nodeId, iteration)` and reads them back during replay or resume. Used for simple approval and timer synchronization that does not need durable persistence.

```ts
import {
  makeApprovalDeferred,
  makeTimerDeferred,
  makeDeferredBridgeKey,
  bridgeApprovalResolve,
  bridgeTimerResolve,
  getDeferredResolution,
} from "smithers-orchestrator/effect/deferred-bridge";

// Resolve an approval decision
bridgeApprovalResolve(runId, nodeId, iteration, { approved: true });

// Retrieve the stored resolution
const exit = getDeferredResolution(runId, nodeId, iteration);
```

```ts
function makeDeferredBridgeKey(runId: string, nodeId: string, iteration: number): string
function bridgeApprovalResolve(runId: string, nodeId: string, iteration: number, decision: { approved: boolean }): void
function bridgeTimerResolve(runId: string, nodeId: string, iteration: number): void
function getDeferredResolution(runId: string, nodeId: string, iteration: number): Exit.Exit<...> | undefined
```
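
Conceptually this is a `Map` with a composite key. A minimal sketch of that idea, assuming a JSON-encoded key (the real key format is not documented here) and using a hypothetical `MiniExit` type in place of Effect's `Exit`:

```typescript
// Hypothetical stand-in for Effect's Exit type.
type MiniExit<A> = { _tag: "Success"; value: A } | { _tag: "Failure"; error: unknown };

const resolutions = new Map<string, MiniExit<unknown>>();

// Assumption: any injective encoding works. JSON avoids collisions that a
// plain delimiter could cause when IDs contain the delimiter themselves.
function makeKey(runId: string, nodeId: string, iteration: number): string {
  return JSON.stringify([runId, nodeId, iteration]);
}

function resolveApproval(
  runId: string,
  nodeId: string,
  iteration: number,
  decision: { approved: boolean },
): void {
  resolutions.set(makeKey(runId, nodeId, iteration), { _tag: "Success", value: decision });
}

function getResolution(runId: string, nodeId: string, iteration: number): MiniExit<unknown> | undefined {
  return resolutions.get(makeKey(runId, nodeId, iteration));
}

resolveApproval("run-1", "review", 0, { approved: true });
const stored = getResolution("run-1", "review", 0);
const missing = getResolution("run-1", "review", 1);
```

Because the map lives in process memory, a restart loses every entry, which is the gap the durable variant below closes.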

---

### EFFECT_DURABLE_DEFERRED_BRIDGE

Durable version of the deferred bridge, built on the `DurableDeferred` primitive from `@effect/workflow`. Approval and `WaitForEvent` nodes use this so that resolutions survive restarts.

Each deferral is keyed by an execution ID derived from the adapter namespace, run ID, node ID, and iteration.

```ts
import {
  awaitApprovalDurableDeferred,
  awaitWaitForEventDurableDeferred,
  bridgeApprovalResolve,
  bridgeWaitForEventResolve,
  bridgeSignalResolve,
  makeDurableDeferredBridgeExecutionId,
} from "smithers-orchestrator/effect/durable-deferred-bridge";

// Wait for an approval decision (called from within a task execution)
const resolution = await awaitApprovalDurableDeferred(adapter, runId, nodeId, iteration);
// resolution.approved, .note, .decidedBy, .decisionJson, .autoApproved

// Resolve it from outside (e.g. from the gateway or HTTP handler)
await bridgeApprovalResolve(adapter, runId, nodeId, iteration, {
  approved: true,
  note: "looks good",
  decidedBy: "alice",
});

// Resolve a WaitForEvent node when a signal arrives
await bridgeWaitForEventResolve(adapter, runId, nodeId, iteration, {
  signalName: "payment.received",
  correlationId: "order-42",
  payloadJson: JSON.stringify({ amount: 100 }),
  seq: 1,
  receivedAtMs: Date.now(),
});

// Resolve all matching WaitForEvent nodes in a run
await bridgeSignalResolve(adapter, runId, {
  signalName: "payment.received",
  correlationId: "order-42",
  payloadJson: JSON.stringify({ amount: 100 }),
  seq: 1,
  receivedAtMs: Date.now(),
});
```

Resolution schemas:

```ts
type ApprovalDurableDeferredResolution = {
  approved: boolean;
  note: string | null;
  decidedBy: string | null;
  decisionJson: string | null;
  autoApproved: boolean;
};

type WaitForEventDurableDeferredResolution = {
  signalName: string;
  correlationId: string | null;
  payloadJson: string;
  seq: number;
  receivedAtMs: number;
};
```

---

### EFFECT_DEFERRED_STATE_BRIDGE

Manages the state machine for timer, approval, and `WaitForEvent` nodes. Reads attempt metadata from the database, determines whether a deferred task is still pending or has already been resolved, and drives the appropriate resolution path.

Key exports:

```ts
import {
  resolveDeferredTaskStateBridge,
  isBridgeManagedTimerTask,
  isBridgeManagedWaitForEventTask,
  cancelPendingTimersBridge,
} from "smithers-orchestrator/effect/deferred-state-bridge";

// Check if a task is bridge-managed
if (isBridgeManagedTimerTask(desc)) {
  // handle a bridge-managed timer task
}
if (isBridgeManagedWaitForEventTask(desc)) {
  // handle a bridge-managed WaitForEvent task
}

// Cancel all pending timers for a run
await cancelPendingTimersBridge(adapter, runId, eventBus);
```

The `resolveDeferredTaskStateBridge` function drives the resolution loop. It reads the current attempt snapshot, handles timer expiry, matches incoming signals, and emits the appropriate `SmithersEvent` when the node settles.

---

### EFFECT_CHILD_WORKFLOW_EXECUTION

Child workflow execution is provided by the `WorkflowMakeBridgeRuntime` context, accessible from within an active workflow body via `getWorkflowMakeBridgeRuntime()`.

```ts
import { getWorkflowMakeBridgeRuntime } from "smithers-orchestrator/effect/workflow-make-bridge";

const bridgeRuntime = getWorkflowMakeBridgeRuntime();
if (bridgeRuntime) {
  const childResult = await bridgeRuntime.executeChildWorkflow(childWorkflow, {
    runId: generateRunId(),
    input: { ...childInput },
  });
}
```

The child workflow is registered as its own Workflow in the shared engine context and executed under the parent's `Scope`. Its continue-as-new loop runs independently but shares the parent engine scope's lifecycle.

---

### EFFECT_WORKER_ENTITY_DISPATCH

Task dispatches pass through an `@effect/cluster` Entity (the `TaskWorkerEntity`). The entity is defined using `@effect/rpc` and sharded via the `SingleRunner`. Each invocation is addressed by its `bridgeKey` — a composite of adapter namespace, workflow name, run ID, node ID, and iteration.

```ts
import {
  TaskWorkerEntity,
  WorkerTask,
  WorkerDispatchKind,
  makeWorkerTask,
} from "smithers-orchestrator/effect/entity-worker";

// Shape of the exported WorkerTask type, shown for reference only
type WorkerTask = {
  executionId: string;
  bridgeKey: string;
  workflowName: string;
  runId: string;
  nodeId: string;
  iteration: number;
  retries: number;
  taskKind: "agent" | "compute" | "static";
  dispatchKind: "compute" | "static" | "legacy";
};
```

---

### EFFECT_SANDBOX_ENTITY_TRANSPORT

Sandbox execution (Bubblewrap, Docker, Codeplane) is routed through an Entity transport. The `SandboxEntity` and `SandboxEntityExecutor` bridge the `SandboxTransportService` interface into the cluster entity model.

```ts
import {
  SandboxEntity,
  SandboxEntityExecutor,
  makeSandboxEntityId,
  makeSandboxTransportServiceEffect,
} from "smithers-orchestrator/effect/sandbox-entity";

// Build the Effect layer that provides SandboxTransportService
const transportLayer = makeSandboxTransportServiceEffect(executorLayer);
```

HTTP-backed executors:

```ts
import {
  CodeplaneSandboxExecutorLive,
  DockerSandboxExecutorLive,
  SandboxHttpRunner,
} from "smithers-orchestrator/effect/http-runner";
```

Socket-backed executor:

```ts
import {
  BubblewrapSandboxExecutorLive,
  SandboxSocketRunner,
} from "smithers-orchestrator/effect/socket-runner";
```

---

## Infrastructure

### EFFECT_CHILD_PROCESS

Wraps Node.js `child_process.spawn` in an Effect with full lifecycle management: output capture, truncation, idle timeout, total timeout, `AbortSignal` forwarding, and detached process group cleanup.

```ts
import { spawnCaptureEffect } from "smithers-orchestrator/effect/child-process";
import { runPromise } from "smithers-orchestrator/effect/runtime";

const result = await runPromise(
  spawnCaptureEffect("git", ["diff", "--stat"], {
    cwd: "/workspace",
    timeoutMs: 30_000,
    idleTimeoutMs: 10_000,
    maxOutputBytes: 200_000,
    onStdout: (chunk) => process.stdout.write(chunk),
  }),
);

// result.stdout, result.stderr, result.exitCode
```

```ts
type SpawnCaptureOptions = {
  cwd: string;
  env?: Record<string, string | undefined>;
  input?: string;
  signal?: AbortSignal;
  timeoutMs?: number;
  idleTimeoutMs?: number;
  maxOutputBytes?: number;       // default: 200_000 bytes
  detached?: boolean;
  onStdout?: (chunk: string) => void;
  onStderr?: (chunk: string) => void;
};

type SpawnCaptureResult = {
  stdout: string;
  stderr: string;
  exitCode: number | null;
};

function spawnCaptureEffect(
  command: string,
  args: string[],
  options: SpawnCaptureOptions,
): Effect.Effect<SpawnCaptureResult, SmithersError>
```

Output exceeding `maxOutputBytes` is truncated and a `smithers.tool.output_truncated_total` metric is incremented. The process group is killed with `SIGKILL` on abort or timeout.
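
The truncation rule can be illustrated with a byte-budgeted buffer. This is a simplified sketch, not the module's code: `CappedOutput` is a hypothetical class, and `truncatedTotal` is a plain counter standing in for the `smithers.tool.output_truncated_total` metric.

```typescript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Stand-in for the smithers.tool.output_truncated_total counter.
let truncatedTotal = 0;

class CappedOutput {
  private readonly maxBytes: number;
  private usedBytes = 0;
  private text = "";
  truncated = false;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  append(chunk: string): void {
    if (this.truncated) return; // budget already spent, drop further output
    const bytes = encoder.encode(chunk);
    const remaining = this.maxBytes - this.usedBytes;
    if (bytes.length <= remaining) {
      this.text += chunk;
      this.usedBytes += bytes.length;
      return;
    }
    // Keep only the bytes that still fit. A multi-byte character split at the
    // boundary decodes to U+FFFD in this sketch.
    this.text += decoder.decode(bytes.subarray(0, remaining));
    this.usedBytes = this.maxBytes;
    this.truncated = true;
    truncatedTotal += 1;
  }

  toString(): string {
    return this.text;
  }
}

const out = new CappedOutput(5);
out.append("abc");
out.append("defg");
out.append("ignored");
```

Counting bytes instead of string length matters because `maxOutputBytes` is a byte limit and non-ASCII output takes more than one byte per character.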

---

### EFFECT_INTEROP

Utilities for wrapping non-Effect code so it fits cleanly into Effect pipelines.

```ts
import {
  fromPromise,
  fromSync,
  ignoreSyncError,
  toError,
} from "smithers-orchestrator/effect/interop";

// Wrap a promise-returning function
const effect = fromPromise("fetch user", () => fetch("/api/user").then(r => r.json()));

// Wrap a synchronous function that may throw
const syncEffect = fromSync("parse JSON", () => JSON.parse(raw));

// Best-effort cleanup — swallows thrown errors
const cleanup = ignoreSyncError("close db", () => db.close());
```

```ts
type ErrorWrapOptions = {
  code?: SmithersErrorCode;
  details?: Record<string, unknown>;
};

function fromPromise<A>(
  label: string,
  evaluate: () => PromiseLike<A>,
  options?: ErrorWrapOptions,
): Effect.Effect<A, SmithersError>

function fromSync<A>(
  label: string,
  evaluate: () => A,
  options?: ErrorWrapOptions,
): Effect.Effect<A, SmithersError>

function ignoreSyncError(label: string, fn: () => void): Effect.Effect<void>

function toError(cause: unknown, label?: string, options?: ErrorWrapOptions): SmithersError
```

All failures are normalized to `SmithersError`. Pass `code` to set a specific `SmithersErrorCode`; without it the code defaults to `"INTERNAL_ERROR"`.
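
The normalization rule can be shown without Effect. The sketch below is an analogue only: `MiniError`, `toMiniError`, and `trySync` are hypothetical, the `label: message` format is an assumption, and a plain result object stands in for the returned `Effect`.

```typescript
type MiniError = Error & { code: string; details?: Record<string, unknown> };
type WrapOptions = { code?: string; details?: Record<string, unknown> };

// Turns any thrown value into an error with a code, defaulting to INTERNAL_ERROR.
function toMiniError(cause: unknown, label?: string, options?: WrapOptions): MiniError {
  const base = cause instanceof Error ? cause.message : String(cause);
  // Assumed message format: the label is prefixed to the original message.
  const error = new Error(label ? `${label}: ${base}` : base) as MiniError;
  error.code = options?.code ?? "INTERNAL_ERROR";
  if (options?.details) error.details = options.details;
  return error;
}

type SyncResult<A> = { ok: true; value: A } | { ok: false; error: MiniError };

// Synchronous analogue of fromSync: captures a throw as a normalized error value.
function trySync<A>(label: string, evaluate: () => A, options?: WrapOptions): SyncResult<A> {
  try {
    return { ok: true, value: evaluate() };
  } catch (cause) {
    return { ok: false, error: toMiniError(cause, label, options) };
  }
}

const parsed = trySync("parse JSON", () => JSON.parse('{"a":1}') as { a: number });
const defaulted = trySync("parse JSON", () => JSON.parse("{"));
const coded = trySync("parse JSON", () => JSON.parse("{"), { code: "INVALID_INPUT" });
```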

---

### EFFECT_SQL_MESSAGE_STORAGE

Provides a SQLite-backed implementation of `@effect/workflow`'s message storage interface. Used by the workflow engine to persist workflow state across process restarts.

```ts
import {
  SqlMessageStorage,
  ensureSqlMessageStorage,
  ensureSqlMessageStorageEffect,
  getSqlMessageStorage,
} from "smithers-orchestrator/effect/sql-message-storage";

// Ensure message storage is initialized for a given database
const storage = await ensureSqlMessageStorage(db);

// Effect version
const storageEffect = ensureSqlMessageStorageEffect(db);
```

The storage creates and manages the `_smithers_runs`, `_smithers_nodes`, and `_smithers_attempts` tables plus supporting indices. The `getSqlMessageStorage(db)` function returns an existing instance without creating one.

---

### EFFECT_LOGGING

Thin wrappers around `Effect.logDebug/Info/Warning/Error` that run fire-and-forget via `runFork`. Each accepts an optional annotations map and an optional log span name.

```ts
import {
  logDebug,
  logInfo,
  logWarning,
  logError,
} from "smithers-orchestrator/effect/logging";

logInfo("workflow started", { runId, workflowName: "deploy" }, "engine:run");

logError("task failed", { runId, nodeId, error: err.message }, "engine:task");
```

```ts
type LogAnnotations = Record<string, unknown> | undefined;

function logDebug(message: string, annotations?: LogAnnotations, span?: string): void
function logInfo(message: string, annotations?: LogAnnotations, span?: string): void
function logWarning(message: string, annotations?: LogAnnotations, span?: string): void
function logError(message: string, annotations?: LogAnnotations, span?: string): void
```

---

### EFFECT_LOG_FORMATS_JSON_PRETTY_LOGFMT

Log format selection is controlled via `SmithersObservabilityOptions.logFormat`. The observability layer applies the chosen format to the shared runtime layer:

| Format | Description |
|---|---|
| `"json"` | Structured JSON lines — suitable for log aggregation pipelines |
| `"pretty"` | Human-readable colored output for local development |
| `"logfmt"` | Key=value logfmt — compatible with Loki and similar systems |

Configure via the observability API:

```ts
import { createSmithers } from "smithers-orchestrator";

const { smithers } = createSmithers({ ... });
// then in your run call:
await smithers.run(workflow, {
  input: { ... },
  observability: { logFormat: "json" },
});
```
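
To make the difference between the three formats concrete, the sketch below renders one log record in each style. It is a hypothetical, self-contained formatter: `LogRecord`, `formatRecord`, and the field layout are assumptions made for illustration, not the code Smithers uses, so real output may differ in field names, ordering, and coloring (colors are omitted here).

```ts
type LogFormat = "json" | "pretty" | "logfmt";

// Hypothetical record shape, for illustration only.
type LogRecord = {
  level: "debug" | "info" | "warning" | "error";
  message: string;
  annotations: Record<string, string | number | boolean>;
};

// Quote a logfmt value only when it contains whitespace, quotes, or "=".
function logfmtValue(value: string | number | boolean): string {
  const text = String(value);
  return /[\s"=]/.test(text) ? JSON.stringify(text) : text;
}

function formatRecord(record: LogRecord, format: LogFormat): string {
  switch (format) {
    case "json":
      // One JSON object per line, annotations flattened next to the message.
      return JSON.stringify({
        level: record.level,
        message: record.message,
        ...record.annotations,
      });
    case "logfmt": {
      const pairs: Array<[string, string | number | boolean]> = [
        ["level", record.level],
        ["msg", record.message],
        ...Object.entries(record.annotations),
      ];
      return pairs.map(([key, value]) => `${key}=${logfmtValue(value)}`).join(" ");
    }
    case "pretty": {
      const extras = Object.entries(record.annotations)
        .map(([key, value]) => `${key}=${value}`)
        .join(" ");
      return `[${record.level.toUpperCase()}] ${record.message}${extras ? " " + extras : ""}`;
    }
  }
}

const record: LogRecord = {
  level: "info",
  message: "workflow started",
  annotations: { runId: "run_1", workflowName: "deploy" },
};

console.log(formatRecord(record, "json"));
console.log(formatRecord(record, "logfmt"));
console.log(formatRecord(record, "pretty"));
```

The practical trade-off is the usual one: `"json"` and `"logfmt"` are easy for machines to parse line by line, while `"pretty"` favors a human scanning a terminal.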

---

### EFFECT_METRICS

Pre-defined Effect `Metric` instances for every significant system boundary in Smithers. Import individual metrics and compose them with `Metric.increment`, `Metric.update`, `Metric.set`, or `Metric.tagged`.

**Counters (selection)**

| Name | Metric |
|---|---|
| `runsTotal` | `smithers.runs.total` |
| `nodesStarted` / `nodesFinished` / `nodesFailed` | `smithers.nodes.*` |
| `toolCallsTotal` / `toolCallErrorsTotal` | `smithers.tool_calls.*` |
| `approvalsRequested` / `approvalsGranted` / `approvalsDenied` | `smithers.approvals.*` |
| `tokensInputTotal` / `tokensOutputTotal` | `smithers.tokens.*` |
| `tokensContextWindowBucketTotal` | `smithers.tokens.context_window_bucket_total` |
| `runsFinishedTotal` / `runsFailedTotal` / `runsCancelledTotal` | `smithers.runs.*_total` |
| `errorsTotal` | `smithers.errors.total` |
| `sandboxCreatedTotal` / `sandboxCompletedTotal` | `smithers.sandbox.*` |

**Gauges (selection)**

| Name | Metric |
|---|---|
| `activeRuns` / `activeNodes` | `smithers.runs.active`, `smithers.nodes.active` |
| `schedulerQueueDepth` | `smithers.scheduler.queue_depth` |
| `approvalPending` | `smithers.approval.pending` |
| `timersPending` | `smithers.timers.pending` |
| `processMemoryRssBytes` / `processHeapUsedBytes` | `smithers.process.*` |

**Histograms (selection)**

| Name | Metric |
|---|---|
| `nodeDuration` / `attemptDuration` | `smithers.node.duration_ms`, `smithers.attempt.duration_ms` |
| `toolDuration` | `smithers.tool.duration_ms` |
| `runDuration` | `smithers.run.duration_ms` |
| `tokensInputPerCall` / `tokensOutputPerCall` | `smithers.tokens.*_per_call` |
| `tokensContextWindowPerCall` | `smithers.tokens.context_window_per_call` |
| `sandboxDurationMs` / `sandboxBundleSizeBytes` | `smithers.sandbox.*` |
| `heartbeatDataSizeBytes` / `heartbeatIntervalMs` | `smithers.heartbeats.*` |

```ts
import { Effect, Metric } from "effect";
import {
  toolCallsTotal,
  toolDuration,
} from "smithers-orchestrator/effect/metrics";

// Record a tool call with tags
const record = Effect.all([
  Metric.increment(Metric.tagged(toolCallsTotal, "tool", "bash")),
  Metric.update(Metric.tagged(toolDuration, "tool", "bash"), durationMs),
], { discard: true });
```

`trackEvent(event: SmithersEvent)` is the high-level entry point — it maps every event type to the correct set of metric updates:

```ts
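import { Effect } from "effect";
import { trackEvent } from "smithers-orchestrator/effect/metrics";

// Fire-and-forget; assumes trackEvent returns an Effect with no unmet requirements.
Effect.runFork(trackEvent({ type: "NodeStarted", runId, nodeId, ... }));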
```

`updateProcessMetrics()` snapshots `process.memoryUsage()` and process uptime into gauges. Call it on a recurring interval from a background fiber.

---

### EFFECT_DIFF_BUNDLE_COMPUTE

Produces a serializable diff bundle by comparing a git working tree against a base ref. Captures tracked file changes, binary files, and untracked files.

```ts
import { computeDiffBundle, DiffBundle, FilePatch } from "smithers-orchestrator/effect/diff-bundle";

const bundle = await computeDiffBundle("HEAD", "/workspace");
// bundle.seq       — sequence number (default: 1)
// bundle.baseRef   — the ref passed in ("HEAD")
// bundle.patches   — FilePatch[]
```

```ts
type FilePatch = {
  path: string;
  operation: "add" | "modify" | "delete";
  diff: string;
  binaryContent?: string;  // base64-encoded for binary files
};

type DiffBundle = {
  seq: number;
  baseRef: string;
  patches: FilePatch[];
};

function computeDiffBundle(
  baseRef: string,
  currentDir: string,
  seq?: number,
): Promise<DiffBundle>
```
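
Because a `DiffBundle` is plain serializable data, it can be inspected before it is sent anywhere. The following is a minimal sketch that re-declares the types above so it runs standalone; `summarizeBundle` and `BundleSummary` are hypothetical helpers, not part of the package, and the sample bundle is hand-written rather than real `computeDiffBundle` output.

```ts
type FilePatch = {
  path: string;
  operation: "add" | "modify" | "delete";
  diff: string;
  binaryContent?: string; // base64-encoded for binary files
};

type DiffBundle = { seq: number; baseRef: string; patches: FilePatch[] };

type BundleSummary = {
  added: string[];
  modified: string[];
  deleted: string[];
  binary: string[];
};

// Hypothetical helper: group patch paths by operation and list binary files.
function summarizeBundle(bundle: DiffBundle): BundleSummary {
  const summary: BundleSummary = { added: [], modified: [], deleted: [], binary: [] };
  for (const patch of bundle.patches) {
    if (patch.operation === "add") summary.added.push(patch.path);
    else if (patch.operation === "modify") summary.modified.push(patch.path);
    else summary.deleted.push(patch.path);
    if (patch.binaryContent !== undefined) summary.binary.push(patch.path);
  }
  return summary;
}

// Hand-written sample data; a real bundle would come from computeDiffBundle.
const sample: DiffBundle = {
  seq: 1,
  baseRef: "HEAD",
  patches: [
    { path: "src/index.ts", operation: "modify", diff: "@@ -1 +1 @@\n-old\n+new\n" },
    { path: "logo.png", operation: "add", diff: "", binaryContent: "iVBORw0KGgo=" },
    { path: "old.txt", operation: "delete", diff: "" },
  ],
};

const bundleSummary = summarizeBundle(sample);
console.log(bundleSummary);

// The bundle survives a JSON round trip unchanged.
const roundTripped: DiffBundle = JSON.parse(JSON.stringify(sample));
console.log(roundTripped.patches.length);
```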

---

### EFFECT_DIFF_BUNDLE_APPLY

Applies a previously computed `DiffBundle` to a target directory. Attempts `git apply` first; falls back to per-patch application using the `diff` library for text patches and direct file writes for binary patches.

```ts
import { applyDiffBundle } from "smithers-orchestrator/effect/diff-bundle";

await applyDiffBundle(bundle, "/sandbox/workspace");
```

```ts
function applyDiffBundle(
  bundle: DiffBundle,
  targetDir: string,
): Promise<void>
```

The target directory is created recursively if it does not exist. Delete operations remove the target file silently.
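
The "bulk first, per-patch fallback" strategy described above can be sketched without touching git or the filesystem by injecting both steps as functions. Everything named here (`PatchAction`, `planPatch`, `applyWithFallback`) is hypothetical and only illustrates the control flow; the real `applyDiffBundle` performs the file operations itself.

```ts
type FilePatch = {
  path: string;
  operation: "add" | "modify" | "delete";
  diff: string;
  binaryContent?: string; // base64-encoded for binary files
};

type DiffBundle = { seq: number; baseRef: string; patches: FilePatch[] };

// Hypothetical description of what a per-patch fallback would do for one file.
type PatchAction =
  | { kind: "remove"; path: string }
  | { kind: "write-binary"; path: string; base64: string }
  | { kind: "apply-text"; path: string; diff: string };

function planPatch(patch: FilePatch): PatchAction {
  if (patch.operation === "delete") return { kind: "remove", path: patch.path };
  if (patch.binaryContent !== undefined) {
    return { kind: "write-binary", path: patch.path, base64: patch.binaryContent };
  }
  return { kind: "apply-text", path: patch.path, diff: patch.diff };
}

// Try the bulk strategy first; fall back to per-patch handling only if it throws.
async function applyWithFallback(
  bundle: DiffBundle,
  applyAll: (bundle: DiffBundle) => Promise<void>,
  applyOne: (action: PatchAction) => Promise<void>,
): Promise<"bulk" | "per-patch"> {
  try {
    await applyAll(bundle);
    return "bulk";
  } catch {
    for (const patch of bundle.patches) {
      await applyOne(planPatch(patch));
    }
    return "per-patch";
  }
}

// Usage with stub strategies: the bulk step fails, so each patch is handled individually.
const handled: string[] = [];
const outcome = applyWithFallback(
  {
    seq: 1,
    baseRef: "HEAD",
    patches: [
      { path: "old.txt", operation: "delete", diff: "" },
      { path: "logo.png", operation: "add", diff: "", binaryContent: "iVBORw0KGgo=" },
    ],
  },
  async () => {
    throw new Error("bulk apply rejected the patch");
  },
  async (action) => {
    handled.push(`${action.kind}:${action.path}`);
  },
);
outcome.then((mode) => console.log(mode, handled));
```

Injecting the two strategies keeps the decision logic testable on its own, at the cost of hiding the details (such as partial application by the bulk step) that a real implementation has to handle.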

---

### EFFECT_SCHEDULER_WAKE_QUEUE

A lightweight notify/wait queue used to wake the scheduler when new work becomes available, avoiding busy polling.

```ts
import { createSchedulerWakeQueue } from "smithers-orchestrator/effect/workflow-make-bridge";

const queue = createSchedulerWakeQueue();

// Background: notify when new tasks are ready
queue.notify();

// Scheduler loop: wait for the next notification
await queue.wait();
```

```ts
type SchedulerWakeQueue = {
  notify(): void;
  wait(): Promise<void>;
};

function createSchedulerWakeQueue(): SchedulerWakeQueue
```

`notify()` resolves a pending `wait()` immediately, or increments an internal pending counter so the next `wait()` call returns without suspending. Multiple `notify()` calls before a `wait()` are coalesced into a single wakeup.
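
The coalescing behavior is easiest to see in a small re-implementation. The sketch below is illustrative only: `createWakeQueueSketch` is a hypothetical stand-in, it uses a boolean flag rather than a counter, and it assumes a single consumer (one scheduler loop calling `wait()` at a time). The real `createSchedulerWakeQueue` may differ internally.

```ts
type SchedulerWakeQueue = {
  notify(): void;
  wait(): Promise<void>;
};

// Illustrative stand-in, not the package's implementation.
function createWakeQueueSketch(): SchedulerWakeQueue {
  let pending = false; // a notify() arrived while nobody was waiting
  let waiter: (() => void) | undefined; // resolver of the currently suspended wait()

  return {
    notify() {
      if (waiter) {
        // Someone is suspended in wait(): wake them right away.
        const resolve = waiter;
        waiter = undefined;
        resolve();
      } else {
        // Nobody is waiting: remember it. Repeated calls collapse into one flag.
        pending = true;
      }
    },
    wait() {
      if (pending) {
        // Consume the remembered notification without suspending.
        pending = false;
        return Promise.resolve();
      }
      return new Promise<void>((resolve) => {
        waiter = resolve;
      });
    },
  };
}

// Usage: two notifications before a wait produce a single wakeup.
const sketchQueue = createWakeQueueSketch();
sketchQueue.notify();
sketchQueue.notify(); // coalesced with the previous call
sketchQueue.wait().then(() => console.log("woke once"));
```

Coalescing is safe for a scheduler because a wakeup only means "re-check for work"; the loop is expected to drain everything that is ready on each pass rather than count notifications.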

---

### EFFECT_WORKFLOW_VERSIONING_RUNTIME

Manages workflow patch decisions for safe migration of long-running workflows. A patch decision is a boolean recorded against a string `patchId` and stored in the run config. New runs see `true`; runs that pre-date the patch see `false`.

```ts
import {
  createWorkflowVersioningRuntime,
  withWorkflowVersioningRuntime,
  getWorkflowVersioningRuntime,
  usePatched,
  getWorkflowPatchDecisions,
} from "smithers-orchestrator/effect/versioning";

// Create a runtime for the current run
const versioningRuntime = createWorkflowVersioningRuntime({
  baseConfig: run.configJson ?? {},
  initialDecisions: getWorkflowPatchDecisions(run.configJson),
  isNewRun: !opts.resume,
  persist: async (config) => { await adapter.updateRunConfig(runId, config); },
});

// Wrap the workflow body execution
const result = withWorkflowVersioningRuntime(versioningRuntime, () =>
  engine.executeBody(workflow, opts),
);

// Flush decisions to the database after each render frame
await versioningRuntime.flush();
```

Inside a workflow (JSX or TOON), use the `usePatched` hook to branch on a patch:

```tsx
import { usePatched } from "smithers-orchestrator/effect/versioning";

function MyWorkflow({ ctx }) {
  const hasNewRetryLogic = usePatched("2026-04-retry-overhaul");

  return (
    <Workflow name="deploy">
      <Task id="run" retries={hasNewRetryLogic ? 3 : 0}>
        Deploy
      </Task>
    </Workflow>
  );
}
```

```ts
type WorkflowVersioningRuntime = {
  resolve(patchId: string): boolean;
  flush(): Promise<void>;
  snapshot(): WorkflowPatchDecisions;
};

function createWorkflowVersioningRuntime(options: WorkflowVersioningRuntimeOptions): WorkflowVersioningRuntime
function withWorkflowVersioningRuntime<T>(runtime: WorkflowVersioningRuntime, execute: () => T): T
function getWorkflowVersioningRuntime(): WorkflowVersioningRuntime | undefined
function getWorkflowPatchDecisions(config: Record<string, unknown> | null | undefined): WorkflowPatchDecisions
function usePatched(patchId: string): boolean
```

`usePatched` calls `resolve()` on the ambient `WorkflowVersioningRuntime`. Outside a versioning scope it always returns `false`.
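
The decision rule can be sketched as a small in-memory runtime. This is a simplified model under stated assumptions, not the package's implementation: `createVersioningSketch` is hypothetical, `WorkflowPatchDecisions` is assumed to be a plain `Record<string, boolean>`, and `persist` receives only the decisions rather than a full run config.

```ts
// Assumed shape: one boolean per patchId.
type WorkflowPatchDecisions = Record<string, boolean>;

type VersioningSketchOptions = {
  initialDecisions: WorkflowPatchDecisions;
  isNewRun: boolean;
  persist: (decisions: WorkflowPatchDecisions) => Promise<void>;
};

// Hypothetical stand-in for createWorkflowVersioningRuntime.
function createVersioningSketch(options: VersioningSketchOptions) {
  const decisions = new Map<string, boolean>();
  for (const key of Object.keys(options.initialDecisions)) {
    decisions.set(key, options.initialDecisions[key]);
  }
  let dirty = false;

  const snapshot = (): WorkflowPatchDecisions => {
    const out: WorkflowPatchDecisions = {};
    decisions.forEach((value, key) => {
      out[key] = value;
    });
    return out;
  };

  return {
    resolve(patchId: string): boolean {
      const recorded = decisions.get(patchId);
      // A recorded decision always wins, so a run never changes branch mid-flight.
      if (recorded !== undefined) return recorded;
      // First sighting: new runs opt in, runs that pre-date the patch opt out.
      decisions.set(patchId, options.isNewRun);
      dirty = true;
      return options.isNewRun;
    },
    async flush(): Promise<void> {
      if (!dirty) return; // nothing new to write
      dirty = false;
      await options.persist(snapshot());
    },
    snapshot,
  };
}

// Usage: a fresh run records `true`; the same run, resumed after a newer patch
// shipped, keeps its old decision and records `false` for the new patch.
const stored: WorkflowPatchDecisions[] = [];
const freshRun = createVersioningSketch({
  initialDecisions: {},
  isNewRun: true,
  persist: async (decisions) => {
    stored.push(decisions);
  },
});
console.log(freshRun.resolve("2026-04-retry-overhaul"));
```

Recording the decision the first time a patch is seen, and never re-evaluating it, is what keeps a resumed run on the same code path it started with.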