// Auto-generated by scripts/generate-docs-index.ts - DO NOT EDIT export const EMBEDDED_DOC_FILENAMES: readonly string[] = ["ERRATA-GPT5-HARMONY.md","ai-schema-normalize.md","auth-broker-gateway.md","bash-tool-runtime.md","blob-artifact-architecture.md","compaction.md","config-usage.md","custom-tools.md","environment-variables.md","extension-loading.md","extensions.md","fs-scan-cache-architecture.md","gemini-manifest-extensions.md","handoff-generation-pipeline.md","hooks.md","install-id.md","marketplace.md","mcp-config.md","mcp-protocol-transports.md","mcp-runtime-lifecycle.md","mcp-server-tool-authoring.md","memory.md","models.md","natives-addon-loader-runtime.md","natives-architecture.md","natives-binding-contract.md","natives-build-release-debugging.md","natives-media-system-utils.md","natives-rust-task-cancellation.md","natives-shell-pty-process.md","natives-text-search-pipeline.md","non-compaction-retry-policy.md","notebook-tool-runtime.md","plugin-manager-installer-plumbing.md","porting-from-pi-mono.md","porting-to-natives.md","provider-streaming-internals.md","python-repl.md","render-mermaid.md","resolve-tool-runtime.md","rpc.md","rulebook-matching-pipeline.md","sdk.md","secrets.md","session-operations-export-share-fork-resume.md","session-switching-and-recent-listing.md","session-tree-plan.md","session.md","skills.md","skills/authoring-extensions.md","skills/authoring-hooks.md","skills/authoring-marketplaces.md","skills/examples/hello-extension/README.md","skills/examples/mini-marketplace/README.md","skills/examples/safety-hook/README.md","slash-command-internals.md","task-agent-discovery.md","theme.md","tools/ask.md","tools/ast-edit.md","tools/ast-grep.md","tools/bash.md","tools/browser.md","tools/calc.md","tools/checkpoint.md","tools/debug.md","tools/edit.md","tools/eval.md","tools/find.md","tools/github.md","tools/inspect_image.md","tools/irc.md","tools/job.md","tools/lsp.md","tools/read.md","tools/recall.md","tools/recipe.md","tools/reflect.md","tools/render_mermaid.md","tools/resolve.md","tools/retain.md","tools/rewind.md","tools/search.md","tools/search_tool_bm25.md","tools/ssh.md","tools/task.md","tools/todo_write.md","tools/web_search.md","tools/write.md","tree.md","ttsr-injection-lifecycle.md","tui-runtime-internals.md","tui.md"]; export const EMBEDDED_DOCS: Readonly> = { "ERRATA-GPT5-HARMONY.md": "# ERRATA — GPT-5 Harmony-Header Leakage\n\n## 1. The problem\n\nOpenAI frames tool calls in the Harmony chat protocol:\n\n```\n<|start|>assistant<|channel|>commentary to=functions.<|message|>{ARGS}<|call|>\n```\n\n`<|channel|>commentary to=functions.NAME` is the **routing header** —\ncontrol tokens consumed by the runtime to dispatch the call. These\ntokens never appear as content under normal operation; the runtime\nstrips them.\n\nThe defect: gpt-5 models occasionally emit, **as ordinary content\ninside `{ARGS}`**, the **plain-text shadow** of these routing tokens —\nthe same characters without the `<|…|>` brackets — and continue\nproducing more pseudo-routing structure (channel name, body marker,\nmultilingual spam, fake tool-result framing). The contamination lives\ninside the visible tool argument and is dispatched to the tool as if it\nwere intended content.\n\n**Critical detail.** The actual `<|start|>` / `<|channel|>` /\n`<|message|>` / `<|call|>` special tokens almost never appear in tool\nargs. What leaks is the bracket-less spelling — `analysis to=functions.X\ncode …` — because OpenAI applies a logit mask suppressing the\ncontrol-token IDs inside the args region. The mass that would have gone\nto those special tokens redistributes onto the un-bracketed plain-text\nrepresentation the model also learned. This makes the leak structurally\ninvisible to the routing parser and lands it in the tool input verbatim.\n\nManifestation in tool args (real corpus example):\n\n```\n~ add_function(iso, ctx, ns, \"installSystemChangeObserver\",\n os_install_system_change_observer);】【\"】【analysis to=functions.edit\n code above เงินไทยฟรีuser to=functions.edit code …\n```\n\nThe leading code is real and intended. Everything after the first\nnon-Latin token through the next clean structural boundary is corruption.\n\n---\n\n## 2. Observed statistics & failure modes\n\nSource: `~/.omp/stats.db` (`ss_tool_calls`, `ss_assistant_msgs`), through\n2026-05-10. 1.05M tool calls scanned.\n\n### 2.1 Rate\n\n| Model | Leaks in tool args | Calls | per million |\n|------------------|-------------------:|--------:|------------:|\n| gpt-5.4 | 37 | 226,957 | 163 |\n| gpt-5.3-codex | 17 | 112,243 | 151 |\n| gpt-5.5 | 2 | 80,750 | 25 |\n| gpt-5.2-codex | 0 | — | — |\n\nPlus 15 hits in assistant visible text / thinking blobs.\n\n### 2.2 Tool distribution\n\n| Tool | Hits |\n|---------------------|-----:|\n| `edit` | 38 |\n| `eval` | 11 |\n| `report_tool_issue` | 3 |\n| `grep`/`read`/`search`/`yield` | 1 each |\n\nConcentrated in tools with free-form (non-JSON-schema) argument formats.\n\n### 2.3 Leak shape (deterministic)\n\n```\nLEAK ::= JUNK_PREFIX MARKER CHANNEL_BODY (LEAK)?\nMARKER ::= \"to=functions.\" TOOL_NAME\nCHANNEL_BODY ::= \" code \" (SPAM | reasoning_prose | fake_tool_output)*\nJUNK_PREFIX ::= (GLITCH_TOKEN | CHANNEL_WORD | NON_LATIN_RUN | \"}\" | \"】【\")+\n```\n\n**Cascading is common.** Of 96 marker occurrences across 71 contaminated\nrecords, 39 contain ≥2 markers and 7 contain ≥3 — the model emits\nmultiple fake `to=functions.X code …` blocks back-to-back, often with\nfake `code_output\\nCell N:\\n…` framing between them. Once the\nplain-text scaffolding is in the residual stream, the prefix now *looks\nlike* a fresh tool envelope start, so the macro prior over continuations\nkeeps voting for more scaffolding. Self-amplifying.\n\n### 2.4 Glitch tokens\n\nSingle-token identifiers in `o200k_base` whose embeddings appear to be\nnear-init from underrepresentation in post-training. ASCII residue\nimmediately before the marker in the natural corpus:\n\n| Surface string | Single-token | Token ID | Hits in corpus |\n|-------------------|:-:|---------:|---:|\n| `Japgolly` | ✅ | 199,745 | 1 |\n| `Jsii` | ✅ | 114,318 | (subtoken of `Jsii_commentary`) |\n| `Jsii_commentary` | — (3 toks) | — | 2 |\n| `changedFiles` | — (2 toks) | — | 8 |\n| `RTLU` | — (2 toks) | — | 3 |\n\n`Japgolly` is in the last 0.13% of the vocabulary — the same family of\nGitHub-corpus residue that produced `SolidGoldMagikarp` in the 2023\nGPT-2 vocabulary (Rumbelow & Watkins). `SolidGoldMagikarp` itself\ntokenizes to 5 tokens in `o200k_base` — that specific token was retired,\nbut the class wasn't.\n\nFor the multi-token entries, the corpus-level signature is the surface\nstring; the underlying glitch trigger is a sub-token (e.g. `Jsii` inside\n`Jsii_commentary`). The detector list (`G` signal) keys on the surface\nstrings.\n\nStable across unrelated sessions. Treated as a high-precision detector\nsignal.\n\n### 2.5 Channel-word leakage\n\n`analysis` (5), `assistant` (5), `commentary` (3), `user` (1) appear\ndirectly preceding `to=`. Always bare words; never `<|channel|>analysis`\nor any other bracketed form. Consistent with §1 — the brackets are\nmasked, the words are not.\n\n### 2.6 Non-Latin spam residue\n\n96 marker hits, by script: CJK 40, Cyrillic 12, Telugu/Kannada/Malayalam\n18, Thai 8, Georgian 7, Armenian 7, Arabic 1. Recurring fragments are\nChinese gambling SEO (`大发时时彩`, `天天中彩票`), Georgian/Abkhaz junk,\nand Thai casino spam — well-known low-quality crawl residue.\n\nThis is the same script distribution observed in the controlled\nreproduction (§7.3), independent of the prompt's natural language.\n\n### 2.7 Failure-mode breakdown for the `edit` tool\n\nThe `edit` tool exists in two variants in the corpus:\n\n| Variant | Calls | Recovery |\n|--------------------------|------:|----------|\n| Patch-DSL (`§PATH`/anchor/`«»≔` ops) | 27 | **Recoverable** by op-truncation (§3.3) |\n| JSON-schema (`{path,edits:[…]}`) | 11 | **Not recoverable** — contamination is escaped *inside* JSON strings, parser accepts it cleanly, content would be written verbatim into source files |\n\nFor Patch-DSL leaks specifically:\n\n- 20/27 cases: contamination on the last input line; nothing follows.\n- 7/27 cases: contamination mid-input; what follows is one of: a\n duplicate replay of an earlier file/anchor, intended content for a\n *different* tool call (the model started its next call inline), or\n pure hallucination. Post-contamination content is never trustworthy.\n\n### 2.8 Mechanism (confirmed)\n\n**Prior collapse from null-embedding glitch tokens, into a\ncontrol-token-masked basin whose mass redistributes onto the\nplain-text shadow of the Harmony protocol.**\n\nStep by step:\n\n1. The model is mid-`{ARGS}` of a Harmony tool call. The runtime applies\n a logit mask suppressing structural control tokens (`<|channel|>`,\n `<|message|>`, `<|call|>`, `<|start|>`, `<|end|>`) inside the args\n region. Without this mask, normal generation would constantly\n hallucinate envelope-closes; with it, those token IDs have logit\n `-∞` in args.\n2. A glitch token `g` is sampled. By construction `g` was in the BPE\n merge corpus but barely in LM/RL training, so its **input embedding\n `e_g` ≈ near-init noise of small norm**.\n3. At position t+1, the residual update `h_{t+1} ≈ LN(h_t + e_g + Attn +\n MLP)` is dominated by the prefix-derived terms; the just-emitted-token\n signal is effectively absent. Generation diversity normally comes\n from `e_x` steering the residual into different sub-regions —\n stripped here.\n4. The next-token distribution therefore collapses onto the **conditional\n prior over continuations of the prefix, with local conditioning\n removed**. In a tool-calling rollout context, that prior is sharply\n peaked on Harmony scaffolding (control tokens + routing tokens) —\n that's what RL trained.\n5. The mask zeros the control-token IDs. Mass redistributes onto the\n **next-best continuation**: the un-bracketed surface-form spelling of\n the same protocol (`analysis`, `commentary`, ` to=functions.X`,\n ` code `). This spelling is unmasked because those characters are\n ordinary tokens.\n6. Once a few tokens of plain-text scaffolding land in the residual\n stream, the prefix now resembles a fresh envelope start. The macro\n prior keeps voting for more scaffolding. Cascading (§2.3) follows.\n7. Multilingual spam after the marker is the same prior-collapse\n continuation, drawn from the training neighborhood of the glitch\n token (often ESL/auto-generated multilingual web junk — exactly the\n crawl residue in §2.6).\n\n**Two corollaries the corpus data demanded but only the experiment\nexplained:**\n\n- **The brackets never appear** (§1, §2.5). The mask is what makes the\n leak land in plain text instead of as a real envelope-close.\n- **Counterintuitive grammar dependency** (§7.4). The leak is *worse* in\n formats closest to OpenAI's training distribution. Off-distribution\n custom grammars dampen the macro-prior basin; the official\n `*** Begin Patch` format is the strongest collapse target.\n\nThe 2023 SolidGoldMagikarp paper documented mechanism (1)+(2)+(4). The\nnew piece is (5): when constrained decoding masks the natural collapse\ntarget, the mass laundered through the un-masked plain-text shadow\nbecomes a structurally-invisible exfiltration channel.", "ai-schema-normalize.md": "# AI tool-schema normalization\n\n`@oh-my-pi/pi-ai` exposes one unified schema normalizer that providers consume\nbefore tools are sent on the wire. All walkers live in\n`packages/ai/src/utils/schema/normalize.ts`; the operational contract is\n`packages/ai/src/utils/schema/CONSTRAINTS.md`.\n\nThere is no separate `strict-mode.ts` module any more — OpenAI strict-mode\nsanitization, OpenAI Responses `oneOf` rewriting, Google/Vertex/Gemini-CLI\nsanitization, Cloud Code Assist Claude sanitization, and MCP sanitization all\nshare the same option-driven walk.\n\n## Entry points\n\nAll exports live under `@oh-my-pi/pi-ai/utils/schema`:\n\n- `normalizeSchema(value, options)` — generic option-driven walker.\n- `normalizeSchemaForGoogle(value)` — Gemini / Vertex / Gemini CLI.\n- `normalizeSchemaForCCA(value)` — Cloud Code Assist Claude (Antigravity + GCA).\n- `normalizeSchemaForMCP(value)` — MCP inputSchemas before they enter the\n custom-tool registry. `tool-bridge.ts` runs every MCP `inputSchema` through\n this dispatcher.\n- `normalizeSchemaForOpenAIResponses(schema)` (alias\n `sanitizeSchemaForOpenAIResponses`) — rewrites `oneOf` → `anyOf` for the\n Responses family.\n- `sanitizeSchemaForStrictMode(schema)` and\n `enforceStrictSchema(schema)` / `tryEnforceStrictSchema(schema)` — the\n OpenAI strict-mode pipeline (sanitize → enforce). All three are exported\n from `normalize.ts`.\n- `adaptSchemaForStrict(schema, strict)` from `./adapt` — thin composer that\n wraps `tryEnforceStrictSchema` for provider call sites and consults\n `PI_NO_STRICT` (env `PI_NO_STRICT`) for the global bypass.\n\nRemoved in the unified-flow refactor:\n\n- `strict-mode.ts` (merged into `normalize.ts`).\n- `sanitize-google.ts` and `normalize-cca.ts` (replaced by\n `normalizeSchemaFor*` dispatchers).\n- `StringEnum` helper — use `z.enum([...])` directly; Zod's emitted JSON\n Schema is already wire-compatible with Google and other providers.\n- `sanitizeSchemaFor{Google,CCA,MCP}` / `prepareSchemaForCCA` — renamed to\n `normalizeSchemaFor{Google,CCA,MCP}`.\n\n## Dispatcher mapping\n\n| Provider transport(s) | Dispatcher |\n| -------------------------------------------------------------------- | -------------------------------------------- |\n| `openai-completions`, `openai-responses`, `openai-codex-responses` | `adaptSchemaForStrict` (sanitize + enforce) |\n| `openai-responses` family (`oneOf` → `anyOf` only) | `normalizeSchemaForOpenAIResponses` |\n| `google-generative-ai`, `google-vertex`, Gemini CLI | `normalizeSchemaForGoogle` |\n| Cloud Code Assist Claude (Antigravity + GCA, `claude-*` model ids) | `normalizeSchemaForCCA` |\n| MCP `inputSchema` ingestion | `normalizeSchemaForMCP` |\n| `anthropic-messages` (native, not CCA) | per-provider whitelist in `anthropic.ts` |\n\nGemini CLI / Antigravity CCA MUST run the full `normalizeSchemaForCCA`\npipeline (not just the first keyword-stripping pass) to keep parity with the\nshared Google Claude path.\n\n## Walk semantics\n\n`normalizeSchema` first upgrades the input to JSON Schema 2020-12, then\nwalks the tree with the option set pinned by the dispatcher. Each node:\n\n1. Inlines `$ref` (see \"Edge cases\" below).\n2. Renames `snake_case` combinator/property keys to camelCase\n (`any_of` → `anyOf`, etc.; collisions follow python-genai\n `pop(from)`/`set(to)` semantics — snake_case wins).\n3. Applies the `handle_null_fields` collapse for nullable unions before\n recursing into children.\n4. Strips keys the target provider does not support, optionally lifting\n human-meaningful keys (`pattern`, `format`, min/max, `default`,\n `examples`, ...) into the sibling `description` via the spill formatter\n (`spill.ts`). Structural/meta keys (`$ref`, `$defs`,\n `additionalProperties`) are not spilled.\n5. Normalizes type unions (`type: [\"T\", \"null\"]` → `type: \"T\"` + nullable\n marker on Google, plain `type: \"T\"` on CCA).\n6. Collapses object-only / same-type combiners, optionally lossy-collapses\n mixed-type combiners (CCA only), and runs the residual-combiner fixpoint.\n7. Validates against AJV 2020 when `validateAndFallback` is set (CCA path)\n and emits the per-tool fallback `{ \"type\": \"object\", \"properties\": {} }`\n on residual incompatibility — `type` array, `type: \"null\"`, `nullable`\n key, or any remaining `anyOf`/`oneOf`/`allOf`.\n\n## OpenAI strict-mode pipeline\n\n`adaptSchemaForStrict(schema, strict)` runs `tryEnforceStrictSchema`,\nwhich composes:\n\n1. **Sanitize** (`sanitizeSchemaForStrictMode`): strips non-structural\n keywords (`format`, `pattern`, min/max, `examples`, `default`,\n `if`/`then`/`else`, `not`, `unevaluated*`, `patternProperties`,\n `dependent*`, `content*`, `min/maxProperties`, `$dynamicRef`, etc.). The\n `default` value is inlined into the sibling `description` as\n ` (default: X)` before being dropped, unless `description` already\n contains `(default:` or no `description` exists.\n2. **Enforce** (`enforceStrictSchema`): every object node gets\n `additionalProperties: false`, every property goes into `required`, and\n optional properties become nullable unions\n (`anyOf: [, { \"type\": \"null\" }]`). Tuple `prefixItems` are\n strictified recursively.\n\nThe two passes share node-level caches and the same epoch-based cycle\nguard, so a single walk on the wire path normalizes refs, allOf, and\nnullable wrapping consistently. `tryEnforceStrictSchema` is fail-open:\nif anything throws, it returns `{ strict: false, schema: original }` so\ncallers MUST emit `strict: true` only when enforcement actually succeeded.\n\n### Edge cases the strict-mode normalizer handles\n\n- **Local `$ref` inlining.** OpenAI strict mode rejects\n `{ \"$ref\": \"...\", \"description\": \"...\" }` with sibling keys. The\n sanitizer pre-resolves local `#/...` refs against the root and merges\n with **sibling keys winning** over the resolved def — same precedence\n as `openai-python`'s `_ensure_strict_json_schema`. Recursive refs are\n guarded by the per-walk epoch.\n- **Single-item `allOf`.** A `{ \"allOf\": [X], ...siblings }` collapses to\n `{ ...X, ...siblings }` with the inlined entry's keys winning over the\n original siblings (matches `openai-python`'s `_pydantic.py:79-83`). Multi-\n item `allOf` is left intact for the downstream validator to reject if\n needed.\n- **Type-array branches and nullable unions.** When a node has\n `type: [\"T\", \"U\"]`, the sanitizer emits one variant schema per type,\n pruning type-specific keywords (e.g. `properties`/`required` only stay on\n the `object` variant, `items` only on the `array` variant). The shared\n `description` is **hoisted onto the `anyOf` wrapper** instead of being\n duplicated on every branch — so a strict nullable union becomes\n `{ anyOf: [T, { type: \"null\" }], description: \"...\" }`, not\n `anyOf: [{ ..., description }, { ..., description }]`.\n- **Enum/const without a `type`.** Both sanitize and enforce paths call\n `inferStrictPrimitiveTypeFromEnumOrConst` to infer the primitive `type`\n from `enum` / `const` values. Mixed-primitive enums (`[1, \"two\", null]`),\n enums containing objects/arrays, and non-primitive `const` values\n (`{a:1}`, `[1,2,3]`) cannot be described by a single `type` keyword and\n trigger the strict-mode fail-open path — emitting a typeless schema\n would just be rejected on the wire by OpenAI.\n\n## Performance: static fingerprint cache\n\n`resolveProviderModels` in `packages/ai/src/model-manager.ts` and\n`readModelCache`/`writeModelCache` in `model-cache.ts` cooperate via a\nschema-v3 `static_fingerprint` column on the `model_cache` SQLite table.\n\n- `fingerprintStatic(staticModels)` hashes the static catalog slice\n (`Bun.hash(JSON.stringify(models))` in base36) and memoizes the result\n in a per-process `WeakMap` keyed by the array reference. Multiple\n cold-start arms calling `resolveProviderModels` with the same\n `staticModels` array pay the JSON+hash cost once.\n- On cache read, if the network fetch is being skipped, the cached row is\n fresh + authoritative, and the cached `static_fingerprint` matches the\n current one, `resolveProviderModels` returns the cached models verbatim\n — the cache already incorporates the same static state, so re-running\n `mergeDynamicModels(static, cache)` would just rebuild the same objects.\n- `mergeModelSources` and `mergeDynamicModels` short-circuit on\n empty-source inputs (the common shape after `(static, [])` or for\n providers without a static catalog), avoiding Map churn entirely.\n\nCache rows written before schema v3 are dropped by the cache-version\ncheck; the column defaults to `''` for any row that survives a version\nupgrade so the fingerprint-equality check naturally fails closed and the\nfull merge re-runs.\n\n## Related\n\n- `docs/models.md` — registry, equivalence, compat flags\n (`supportsStrictMode`, `toolStrictMode`, `disableStrictTools`).\n- `docs/provider-streaming-internals.md` — how the normalized schemas are\n used downstream during the provider stream loop.\n- `docs/mcp-server-tool-authoring.md` — MCP `inputSchema` ingestion via\n `normalizeSchemaForMCP`.\n- `packages/ai/src/utils/schema/CONSTRAINTS.md` — operational contract for\n every normalization rule.\n", "auth-broker-gateway.md": "# Auth Broker and Auth Gateway\n\nThe auth broker and auth gateway are two cooperating HTTP services that move OAuth refresh tokens and provider access tokens off developer laptops and into a single broker host.\n\n- **`omp auth-broker serve`** holds the canonical SQLite credential vault, performs OAuth refreshes, and exposes a small REST API (`/v1/snapshot`, `/v1/credential/:id/refresh`, `/v1/credential/:id/disable`, `/v1/credential`, `/v1/usage`, `/v1/healthz`).\n- **`omp auth-gateway serve`** is a forward-proxy. It accepts OpenAI Chat Completions, Anthropic Messages, and OpenAI Responses requests, injects the broker-resolved access token, and forwards the bytes to the real provider. Clients (containerised omp, llm-git, the macOS usage widget, …) never see the access token.\n\nTransport security between operator, broker, and gateway is delegated to the operator (Tailscale / Wireguard / reverse proxy + TLS). Every endpoint except `/v1/healthz` (broker) and `/healthz` (gateway) requires a bearer token.\n\nSource: `packages/ai/src/auth-broker/`, `packages/ai/src/auth-gateway/`, `packages/coding-agent/src/cli/auth-broker-cli.ts`, `packages/coding-agent/src/cli/auth-gateway-cli.ts`, `packages/coding-agent/src/session/auth-broker-config.ts`.\n\n## Data flow\n\n```\n ┌────────────────────────────────────────────────────────────┐\n │ broker host │\n │ │\n developer ──▶ │ ┌──────────────────────────┐ ┌────────────────────┐ │\n laptop / │ │ omp auth-broker serve │◀──▶│ SQLite agent.db │ │\n CI / robomp │ │ - holds refresh tokens │ │ (canonical writer)│ │\n │ │ - background refresher │ └────────────────────┘ │\n │ │ /v1/{snapshot,refresh,…}│ │\n │ └─────────┬────────────────┘ │\n │ │ bearer ($CONFIG_DIR/auth-broker.token) │\n │ ▼ │\n │ ┌──────────────────────────┐ │\n │ │ omp auth-gateway serve │ RemoteAuthCredentialStore │\n │ │ /v1/{chat,messages,…} │ pulls /v1/snapshot at boot, │\n │ │ /v1/usage, /v1/models │ refreshes credentials by id │\n │ └─────────┬────────────────┘ via the broker on expiry │\n └────────────┼───────────────────────────────────────────────┘\n │ bearer ($CONFIG_DIR/auth-gateway.token)\n ▼\n unauthenticated clients\n (llm-git, macOS widget, robomp containers, IDE plugins, …)\n │\n ▼ same path is forwarded with Authorization\n api.anthropic.com / api.openai.com / …\n```\n\nThe broker is the only writer of OAuth refresh tokens. Clients (including the gateway itself) load a redacted snapshot in which every `refresh` field has been replaced with `REMOTE_REFRESH_SENTINEL`; when an access token expires the client calls `POST /v1/credential/:id/refresh` and the broker performs the refresh server-side. `RemoteAuthCredentialStore` rejects any local code path that tries to write through it, with an error pointing at `omp auth-broker login` / `omp auth-broker logout`.\n\n## auth-broker\n\n### CLI\n\n```\nomp auth-broker serve [--bind=host:port] # boot the broker\nomp auth-broker token [--regenerate] [--json] # print or rotate the bearer token\nomp auth-broker login [--via=user@host] [--dry-run]\nomp auth-broker logout \nomp auth-broker import [--provider=] [--include-disabled] [--dry-run] [--json]\nomp auth-broker migrate --from-local [--dry-run] [--json]\nomp auth-broker status [--json]\n```\n\n- `serve` opens the local SQLite store at `getAgentDbPath()` and binds an HTTP listener (default `127.0.0.1:8765`). On startup a token is ensured at `/auth-broker.token` (mode `0600`, `0700` parent dir). The background refresher refreshes any OAuth credential whose `expires - Date.now() < refreshSkewMs` (default 5 min) every `refreshIntervalMs` (default 60 s).\n- `token` prints the cached bearer or generates a new one. `--regenerate` rotates it.\n- `login ` runs the per-provider OAuth flow locally, or — with `--via=user@host` — `ssh -L :127.0.0.1: user@host omp auth-broker login ` so the OAuth callback hits the local browser but the credential is written on the broker host. Built-in callback ports: `anthropic:54545`, `openai-codex:1455`, `google-gemini-cli:8085`, `google-antigravity:51121`, `gitlab-duo:8080`.\n- `logout ` deletes every credential row for ``.\n- `import ` imports CLIProxyAPI-style JSON credentials into the local SQLite store. Maps `type` field → omp provider (`claude → anthropic`, `codex → openai-codex`, `gemini → google-gemini-cli`, `antigravity → google-antigravity`, `gemini-cli → google-gemini-cli`).\n- `migrate --from-local` walks the local SQLite store + env-derived credentials and idempotently uploads them to the configured broker (`POST /v1/credential`).\n- `status` health-pings the configured remote broker.\n\n### Endpoints\n\n| Method | Path | Auth | Purpose |\n| ------ | ---- | ---- | ------- |\n| `GET` | `/v1/healthz` | none | Liveness + version |\n| `GET` | `/v1/snapshot` | bearer | Redacted snapshot (refresh tokens replaced by sentinel) |\n| `POST` | `/v1/credential` | bearer | Upsert one OAuth or API-key credential |\n| `POST` | `/v1/credential/:id/refresh` | bearer | Force-refresh one OAuth credential |\n| `POST` | `/v1/credential/:id/disable` | bearer | Disable one credential with a recorded cause |\n| `GET` | `/v1/usage` | bearer | Aggregate `UsageReport[]` across credentials |\n\nRequests use `Authorization: Bearer `. The server compares against an in-memory token allow-list; the gateway’s implementation uses a timing-safe comparison.\n\n### Background refresher\n\n`AuthBrokerRefresher` iterates active OAuth credentials at `refreshIntervalMs` cadence and refreshes any within `refreshSkewMs` of expiry. Refreshes are single-flighted per credential id so a slow refresh cannot be retriggered. The refresher distinguishes:\n\n- **definitive failures** (`invalid_grant`, `invalid_token`, `revoked`, unauthorized refresh-token, 401/403 not from a network blip) — credentials are passed to `AuthStorage.disableCredentialById(id, cause)` so the next snapshot pull surfaces a clean delete on the client;\n- **transient failures** (timeout / ECONNREFUSED / fetch failed) — left in place for the next sweep.\n\n## auth-gateway\n\n### CLI\n\n```\nomp auth-gateway serve [--bind=host:port] [--no-auth]\nomp auth-gateway token [--regenerate] [--json]\nomp auth-gateway status [--json]\n```\n\n- `serve` requires `OMP_AUTH_BROKER_URL` (or `auth.broker.url` in `config.yml`) — the gateway is itself a broker client. It calls `AuthBrokerClient.fetchSnapshot()`, wraps it in `RemoteAuthCredentialStore`, and constructs an `AuthStorage` that resolves access tokens through the broker. Default bind is `127.0.0.1:4000`. The gateway token is stored at `/auth-gateway.token` (`0600`); `--no-auth` disables the bearer check entirely (loopback-only use).\n- `token` / `status` mirror the broker’s equivalents.\n\n### Endpoints\n\n| Method | Path | Auth | Purpose |\n| ------ | ---- | ---- | ------- |\n| `GET` | `/healthz` | none | Liveness + version |\n| `GET` | `/v1/usage` | bearer | Aggregate `UsageReport[]` (proxied through `AuthStorage`) |\n| `GET` | `/v1/models` | bearer | Bundled-model catalog filtered to providers with credentials |\n| `POST` | `/v1/chat/completions` | bearer | OpenAI Chat Completions wire format |\n| `POST` | `/v1/messages` | bearer | Anthropic Messages wire format |\n| `POST` | `/v1/responses` | bearer | OpenAI Responses wire format |\n\nThe model id is read from the top-level `model` field. The gateway picks the first bundled `Model` matching that id and:\n\n- **Passthrough fast-path** — when the inbound wire format matches the model’s native API (`openai-chat → openai-completions`, `anthropic-messages → anthropic-messages`, `openai-responses → openai-responses`), the request body is forwarded byte-for-byte with the client `Authorization`/`x-api-key` stripped and replaced by `Authorization: Bearer `. Provider-specific fields (`cache_control`, `service_tier`, tool-choice extensions, …) flow through unmodified. Hop-by-hop headers (RFC 7230) plus `Content-Encoding`/`Content-Length` are stripped from the upstream response.\n- **Translate path** — when the inbound format and the resolved model’s API differ (e.g. `/v1/chat/completions` targeting an Anthropic model, or `/v1/responses` targeting `openai-codex-responses` which runs over a websocket transport), the request is parsed against the wire schema, rebuilt into an omp `Context`, dispatched through `streamSimple()`, and re-encoded back to the inbound format (SSE for streamed responses).\n\n`idleTimeout` on the underlying `Bun.serve` is set to `255 s` so long thinking-budget calls do not get killed by Bun’s default idle timeout.\n\n## Usage cache: server-side 5-min jitter + client-side 15 s single-flight\n\nTwo layers cache the aggregate provider-usage report. Both are intentional and stacked.\n\n### Server-side cache (broker `AuthStorage`)\n\n`AuthStorage` caches each credential’s `UsageReport` in the broker’s SQLite store at a **5-minute per-credential TTL with ±25 % jitter**. Anthropic and OpenAI rate-limit `/usage` aggressively per source IP, and a synchronized 5-credential fan-out trips 429s every cycle; the jitter decorrelates refresh times within a few cycles. On fetch failure the store keeps the **last-good** report for up to 24 h with a short jittered re-poll window — so a transient upstream blip never blanks out the widget.\n\nConstants: `USAGE_REPORT_TTL_MS = 5 * 60_000`, `USAGE_LAST_GOOD_RETENTION_MS = 24 * 60 * 60_000` (`packages/ai/src/auth-storage.ts`).\n\n### Client-side single-flight (`RemoteAuthCredentialStore`)\n\nWhen the gateway (or any other broker client) calls `fetchUsageReports()` / `getUsageReport(provider, credential)`, `RemoteAuthCredentialStore` coalesces concurrent calls into a single `GET /v1/usage` round-trip and caches the result for **15 s** in memory.\n\n- `USAGE_CACHE_TTL_MS = 15_000` (`packages/ai/src/auth-broker/remote-store.ts`).\n- A single `#usageInflight` promise is shared across all callers; a per-caller `AbortSignal` is **raced** against the shared promise, not threaded into it, so one caller’s abort never cascades into a peer’s in-flight request.\n- On fetch failure the rejected promise is logged and the awaited value is `null` — callers (`AuthStorage.fetchUsageReports`, `#getUsageReport`) treat a `null` report as \"no usage signal for this cycle\" and proceed without it. **This is the 15 s TTL fallback**: the client absorbs transient broker outages by suppressing the error, returning `null` to ranking, and re-attempting after the 15 s window.\n\nThe 15 s client window deliberately sits below the broker’s 5 min server cache, so almost every client poll is served from the broker’s already-cached value; the client cache exists to absorb the parallel fan-out generated by `AuthStorage.#rankOAuthSelections` into a single broker round-trip.\n\n## Operator opt-in\n\nThe broker is **off** unless `OMP_AUTH_BROKER_URL` (or `auth.broker.url` in `config.yml`) is set. When set, `discoverAuthStorage` in `packages/coding-agent/src/sdk.ts` swaps the local SQLite credential store for `RemoteAuthCredentialStore` and every API call resolves credentials through the broker.\n\n### Environment variables\n\n| Variable | Purpose | Required when |\n| -------- | ------- | ------------- |\n| `OMP_AUTH_BROKER_URL` | Base URL of the remote auth-broker (e.g. `https://broker.tailnet:8765`). Selecting this puts the client in broker mode — local SQLite is bypassed. | Any time the omp client should resolve credentials through a broker (and required by `omp auth-gateway serve`). |\n| `OMP_AUTH_BROKER_TOKEN` | Bearer token used for every broker endpoint except `/v1/healthz`. | When `OMP_AUTH_BROKER_URL` is set and no token is available from `auth.broker.token` or `/auth-broker.token`. |\n\nResolution order in `resolveAuthBrokerConfig()`:\n\n1. `OMP_AUTH_BROKER_URL` env (else `auth.broker.url` from `config.yml`, with `$ENV_NAME` resolution);\n2. `OMP_AUTH_BROKER_TOKEN` env (else `auth.broker.token` from `config.yml`, else `/auth-broker.token`);\n3. URL set but no token resolvable → hard error pointing at the token file path.\n\nThe gateway has no dedicated env vars — it inherits `OMP_AUTH_BROKER_*` because it is itself a broker client.\n\n### `config.yml` keys\n\n| Key | Default | Purpose |\n| --- | ------- | ------- |\n| `auth.broker.url` | unset | Same as `OMP_AUTH_BROKER_URL`; env wins. Hidden from the settings UI. |\n| `auth.broker.token` | unset | Same as `OMP_AUTH_BROKER_TOKEN`; env wins. Values may be the literal token or `$ENV_NAME` to indirect through env. |\n\n### Token files\n\n| Path | Owner | Mode |\n| ---- | ----- | ---- |\n| `/auth-broker.token` | `omp auth-broker serve` (created at first start) | `0600` in a `0700` parent dir |\n| `/auth-gateway.token` | `omp auth-gateway serve` (skipped under `--no-auth`) | `0600` in a `0700` parent dir |\n\n`` resolves to `~/.omp/` (respecting `PI_CONFIG_DIR`).\n\n## Interaction with the local API-key resolution order\n\nThe broker only owns OAuth credentials and provider-API-key credentials that were uploaded to it. The standard credential ladder in `models.md` (`Auth and API key resolution order`) is preserved, with one addition committed alongside the gateway:\n\n- `AuthStorage.setConfigApiKey / removeConfigApiKey / clearConfigApiKeys` let a `models.yml` `apiKey` beat a stored OAuth token **without** overriding an explicit `--api-key`. This is what allows a broker-resolved OAuth credential to be reliably shadowed by a per-environment `models.yml` config key when both are present.\n\n## See also\n\n- [`secrets.md`](./secrets.md) — secret obfuscation around tokens that *do* leak through (e.g. `OMP_AUTH_BROKER_TOKEN` in shell output).\n- [`models.md`](./models.md) — provider auth resolution order; the broker plugs in at layers 2–3 (stored credentials).\n- [`environment-variables.md`](./environment-variables.md) — full env reference including `OMP_AUTH_BROKER_URL` / `OMP_AUTH_BROKER_TOKEN`.\n", "bash-tool-runtime.md": "# Bash tool runtime\n\nThis document describes the **`bash` tool** runtime path used by agent tool calls, from command normalization to execution, truncation/artifacts, and rendering.\n\nIt also calls out where behavior diverges in interactive TUI, print mode, RPC mode, and user-initiated bang (`!`) shell execution.\n\n## Scope and runtime surfaces\n\nThere are two different bash execution surfaces in coding-agent:\n\n1. **Tool-call surface** (`toolName: \"bash\"`): used when the model calls the bash tool.\n - Entry point: `BashTool.execute()`.\n - Parameters include `command`, optional `env`, `timeout`, `cwd`, `head`, `tail`, `pty`, and, when `async.enabled` is true, `async`.\n2. **User bang-command surface** (`!cmd` from interactive input or RPC `bash` command): session-level helper path.\n - Entry point: `AgentSession.executeBash()`.\n\nBoth eventually use `executeBash()` in `src/exec/bash-executor.ts` for non-PTY execution, but only the tool-call path runs normalization/interception, optional managed background-job handling, and tool renderer logic.\n\n## End-to-end tool-call pipeline\n\n## 1) Input handling and parameter merge\n\n`BashTool.execute()` currently handles input before execution as follows:\n\n- validates optional `env` names against shell-variable syntax,\n- extracts a leading `cd && ...` into `cwd` when `cwd` was not supplied,\n- rejects `async: true` when `async.enabled` is false,\n- uses only explicit `head`/`tail` tool args for post-run filtering.\n\n`normalizeBashCommand()` still exists in `src/tools/bash-normalize.ts`, but `BashTool.execute()` does not call it in the current source. Trailing shell pipes such as `| head -n 50` remain part of the shell command unless the caller uses the structured `head`/`tail` args.\n\n## 2) Optional interception (blocked-command path)\n\nIf `bashInterceptor.enabled` is true, `BashTool` loads rules from settings and runs `checkBashInterception()` against the normalized command.\n\nInterception behavior:\n\n- command is blocked **only** when:\n - regex rule matches, and\n - the suggested tool is present in `ctx.toolNames`.\n- invalid regex rules are silently skipped.\n- on block, `BashTool` throws `ToolError` with message:\n - `Blocked: ...`\n - original command included.\n\nDefault rule patterns (defined in code) target common misuses:\n\n- file readers (`cat`, `head`, `tail`, ...)\n- search tools (`grep`, `rg`, ...)\n- file finders (`find`, `fd`, ...)\n- in-place editors (`sed -i`, `perl -i`, `awk -i inplace`)\n- shell redirection writes (`echo ... > file`, heredoc redirection)\n\n### Caveat\n\n`InterceptionResult` includes `suggestedTool`, but `BashTool` currently surfaces only the message text (no structured suggested-tool field in `details`).\n\n## 3) CWD validation and timeout clamping\n\n`cwd` is resolved relative to session cwd (`resolveToCwd`), then validated via `stat`:\n\n- missing path -> `ToolError(\"Working directory does not exist: ...\")`\n- non-directory -> `ToolError(\"Working directory is not a directory: ...\")`\n\nTimeout is clamped to `[1, 3600]` seconds and converted to milliseconds.\n\n## 4) Artifact allocation\n\nBefore execution, the tool allocates an artifact path/id (best-effort) for truncated output storage.\n\n- artifact allocation failure is non-fatal (execution continues without artifact spill file),\n- artifact id/path are passed into execution path for full-output persistence on truncation.\n\n## 5) PTY vs non-PTY execution selection\n\n`BashTool` chooses PTY execution only when all are true:\n\n- tool input `pty === true`\n- `PI_NO_PTY !== \"1\"`\n- tool context has UI (`ctx.hasUI === true` and `ctx.ui` set)\n\nOtherwise it uses non-interactive `executeBash()`.\n\nThat means print mode and non-UI RPC/tool contexts always use non-PTY.\n\n## Non-interactive execution engine (`executeBash`)\n\n## Shell session reuse model\n\n`executeBash()` caches native `Shell` instances in a process-global map keyed by:\n\n- shell path,\n- configured command prefix,\n- snapshot path,\n- serialized shell env,\n- optional agent session key.\n\nSession-level bang-command executions pass `sessionKey: this.sessionId`.\n\nTool-call executions pass `sessionKey: this.session.getSessionId?.()`, when available. In both surfaces, a session key isolates shell reuse per session; without one, reuse falls back to shell config/snapshot/env.\n\n## Shell config and snapshot behavior\n\nAt each call, executor loads settings shell config (`shell`, `env`, optional `prefix`).\n\nIf selected shell includes `bash`, it attempts `getOrCreateSnapshot()`:\n\n- snapshot captures aliases/functions/options from user rc,\n- snapshot creation is best-effort,\n- failure falls back to no snapshot.\n\nIf `prefix` is configured, command becomes:\n\n```text\n \n```\n\n## Streaming and cancellation\n\n`Shell.run()` streams chunks to `OutputSink` and optional `onChunk` callback.\n\nCancellation:\n\n- aborted signal triggers `shellSession.abort(...)`,\n- timeout from native result is mapped to `cancelled: true` + annotation text,\n- explicit cancellation similarly returns `cancelled: true` + annotation.\n\nNo exception is thrown inside executor for timeout/cancel; it returns structured `BashResult` and lets caller map error semantics.\n\n## Interactive PTY path (`runInteractiveBashPty`)\n\nWhen PTY is enabled, tool runs `runInteractiveBashPty()` which opens an overlay console component and drives a native `PtySession`.\n\nBehavior highlights:\n\n- xterm-headless virtual terminal renders viewport in overlay,\n- keyboard input is normalized (including Kitty sequences and application cursor mode handling),\n- `esc` while running kills the PTY session,\n- terminal resize propagates to PTY (`session.resize(cols, rows)`).\n\nEnvironment hardening defaults are injected for unattended runs:\n\n- pagers disabled (`PAGER=cat`, `GIT_PAGER=cat`, etc.),\n- editor prompts disabled (`GIT_EDITOR=true`, `EDITOR=true`, ...),\n- terminal/auth prompts reduced (`GIT_TERMINAL_PROMPT=0`, `SSH_ASKPASS=/usr/bin/false`, `CI=1`),\n- package-manager/tool automation flags for non-interactive behavior.\n\nPTY output is normalized (`CRLF`/`CR` to `LF`, `sanitizeText`) and written into `OutputSink`, including artifact spill support.\n\nOn PTY startup/runtime error, sink receives `PTY error: ...` line and command finalizes with undefined exit code.\n\n## Output handling: streaming, truncation, artifact spill\n\nBoth PTY and non-PTY paths use `OutputSink`.\n\n## OutputSink semantics\n\n- keeps an in-memory UTF-8-safe tail buffer (`DEFAULT_MAX_BYTES`, currently 50KB),\n- tracks total bytes/lines seen,\n- if artifact path exists and output overflows (or file already active), writes full stream to artifact file,\n- when memory threshold overflows, trims in-memory buffer to tail (UTF-8 boundary safe),\n- marks `truncated` when overflow/file spill occurs.\n\n`dump()` returns:\n\n- `output` (possibly annotated prefix),\n- `truncated`,\n- `totalLines/totalBytes`,\n- `outputLines/outputBytes`,\n- `artifactId` if artifact file was active.\n\n### Long-output caveat\n\nRuntime truncation is byte-threshold based in `OutputSink` (50KB default). It does not enforce a hard 2000-line cap in this code path.\n\n## Live tool updates and async jobs\n\nFor non-PTY foreground execution, `BashTool` uses a separate `TailBuffer` for partial updates and emits `onUpdate` snapshots while command is running.\n\nFor PTY execution, live rendering is handled by custom UI overlay, not by `onUpdate` text chunks.\n\nWhen `async.enabled` is true and the call passes `async: true`, `BashTool` starts a managed bash job, returns a running job result with a job id, and stores completion through the session managed-job path. Auto-backgrounding can also start this path after `bash.autoBackground.thresholdMs`.\n\n## Result shaping, metadata, and error mapping\n\nAfter execution:\n\n1. `cancelled` handling:\n - if abort signal is aborted -> throw `ToolAbortError` (abort semantics),\n - else -> throw `ToolError` (treated as tool failure).\n2. PTY `timedOut` -> throw `ToolError`.\n3. apply head/tail filters to final output text (`applyHeadTail`, head then tail).\n4. empty output becomes `(no output)`.\n5. attach truncation metadata via `toolResult(...).truncationFromSummary(result, { direction: \"tail\" })`.\n6. exit-code mapping:\n - missing exit code -> `ToolError(\"... missing exit status\")`\n - non-zero exit -> `ToolError(\"... Command exited with code N\")`\n - zero exit -> success result.\n\nSuccess payload structure:\n\n- `content`: text output,\n- `details.meta.truncation` when truncated, including:\n - `direction`, `truncatedBy`, total/output line+byte counts,\n - `shownRange`,\n - `artifactId` when available.\n\nBecause built-in tools are wrapped with `wrapToolWithMetaNotice()`, truncation notice text is appended to final text content automatically (for example: `Full: artifact://`).\n\n## Rendering paths\n\n## Tool-call renderer (`bashToolRenderer`)\n\n`bashToolRenderer` is used for tool-call messages (`toolCall` / `toolResult`):\n\n- collapsed mode shows visual-line-truncated preview,\n- expanded mode shows all currently available output text,\n- warning line includes truncation reason and `artifact://` when truncated,\n- timeout value (from args) is shown in footer metadata line.\n\n### Caveat: full artifact expansion\n\n`BashRenderContext` has `isFullOutput`, but current renderer context builder does not set it for bash tool results. Expanded view still uses the text already in result content (tail/truncated output) unless another caller provides full artifact content.\n\n## User bang-command component (`BashExecutionComponent`)\n\n`BashExecutionComponent` is for user `!` commands in interactive mode (not model tool calls):\n\n- streams chunks live,\n- collapsed preview keeps last 20 logical lines,\n- line clamp at 4000 chars per line,\n- shows truncation + artifact warnings when metadata is present,\n- marks cancelled/error/exit state separately.\n\nThis component is wired by `CommandController.handleBashCommand()` and fed from `AgentSession.executeBash()`.\n\n## Mode-specific behavior differences\n\n| Surface | Entry path | PTY eligible | Live output UX | Error surfacing |\n| ------------------------------ | ----------------------------------------------------- | -------------------------------------------------------------------- | ------------------------------------------------------------------------ | ------------------------------------------------ |\n| Interactive tool call | `BashTool.execute` | Yes, when `pty=true` and UI exists and `PI_NO_PTY!=1` | PTY overlay (interactive) or streamed tail updates | Tool errors become `toolResult.isError` |\n| Print mode tool call | `BashTool.execute` | No (no UI context) | No TUI overlay; output appears in event stream/final assistant text flow | Same tool error mapping |\n| RPC tool call (agent tooling) | `BashTool.execute` | Usually no UI -> non-PTY | Structured tool events/results | Same tool error mapping |\n| Interactive bang command (`!`) | `AgentSession.executeBash` + `BashExecutionComponent` | No (uses executor directly) | Dedicated bash execution component | Controller catches exceptions and shows UI error |\n| RPC `bash` command | `rpc-mode` -> `session.executeBash` | No | Returns `BashResult` directly | Consumer handles returned fields |\n\n## Operational caveats\n\n- Interceptor only blocks commands when suggested tool is currently available in context.\n- If artifact allocation fails, truncation still occurs but no `artifact://` back-reference is available.\n- Shell session cache has no explicit eviction in this module; lifetime is process-scoped.\n- PTY and non-PTY timeout surfaces differ:\n - PTY exposes explicit `timedOut` result field,\n - non-PTY maps timeout into `cancelled + annotation` summary.\n\n## Implementation files\n\n- [`src/tools/bash.ts`](../packages/coding-agent/src/tools/bash.ts) — tool entrypoint, input handling/interception, async and PTY/non-PTY selection, result/error mapping, bash tool renderer.\n- [`src/tools/bash-normalize.ts`](../packages/coding-agent/src/tools/bash-normalize.ts) — post-run head/tail filtering; also contains an unused command-normalization helper.\n- [`src/tools/bash-interceptor.ts`](../packages/coding-agent/src/tools/bash-interceptor.ts) — interceptor rule matching and blocked-command messages.\n- [`src/exec/bash-executor.ts`](../packages/coding-agent/src/exec/bash-executor.ts) — non-PTY executor, shell session reuse, cancellation wiring, output sink integration.\n- [`src/tools/bash-interactive.ts`](../packages/coding-agent/src/tools/bash-interactive.ts) — PTY runtime, overlay UI, input normalization, non-interactive env defaults.\n- [`src/session/streaming-output.ts`](../packages/coding-agent/src/session/streaming-output.ts) — `OutputSink`, `TailBuffer`, truncation/artifact spill, and summary metadata.\n- [`src/tools/output-meta.ts`](../packages/coding-agent/src/tools/output-meta.ts) — truncation metadata shape + notice injection wrapper.\n- [`src/session/agent-session.ts`](../packages/coding-agent/src/session/agent-session.ts) — session-level `executeBash`, message recording, abort lifecycle.\n- [`src/modes/components/bash-execution.ts`](../packages/coding-agent/src/modes/components/bash-execution.ts) — interactive `!` command execution component.\n- [`src/modes/controllers/command-controller.ts`](../packages/coding-agent/src/modes/controllers/command-controller.ts) — wiring for interactive `!` command UI stream/update completion.\n- [`src/modes/rpc/rpc-mode.ts`](../packages/coding-agent/src/modes/rpc/rpc-mode.ts) — RPC `bash` and `abort_bash` command surface.\n- [`src/internal-urls/artifact-protocol.ts`](../packages/coding-agent/src/internal-urls/artifact-protocol.ts) — `artifact://` resolution.\n", "blob-artifact-architecture.md": "# Blob and artifact storage architecture\n\nThis document describes how coding-agent stores large/binary payloads outside session JSONL, how truncated tool output is persisted, and how internal URLs (`artifact://`, `agent://`) resolve back to stored data.\n\n## Why two storage systems exist\n\nThe runtime uses two different persistence mechanisms for different data shapes:\n\n- **Content-addressed blobs** (`blob:sha256:`): global storage used to externalize large image base64 payloads and provider image data URLs from persisted session entries.\n- **Session-scoped artifacts** (files under `/`): per-session text files used for full tool outputs and subagent outputs.\n\nThey are intentionally separate:\n\n- blob storage optimizes deduplication and stable references by content hash,\n- artifact storage optimizes append-only session tooling and human/tool retrieval by local IDs.\n\n## Storage boundaries and on-disk layout\n\n## Blob store boundary (global)\n\n`SessionManager` constructs `BlobStore(getBlobsDir())`, so blob files live in a shared global blob directory (not in a session folder).\n\nBlob file naming:\n\n- file path: `/`\n- no extension\n- reference string stored in entries: `blob:sha256:`\n\nImplications:\n\n- same binary content across sessions resolves to the same hash/path,\n- writes are idempotent at the content level,\n- blobs can outlive any individual session file.\n\n## Artifact boundary (session-local)\n\n`ArtifactManager` derives artifact directory from session file path:\n\n- session file: `.../_.jsonl`\n- artifacts directory: `.../_/` (strip `.jsonl`)\n\nArtifact types share this directory:\n\n- truncated tool output files: `..log` (for `artifact://`)\n- subagent output files: `.md` (for `agent://`)\n\n## ID and name allocation schemes\n\n## Blob IDs: content hash\n\n`BlobStore.put()` computes SHA-256 over the bytes it is given and returns:\n\n- `hash`: hex digest,\n- `path`: `/`,\n- `ref`: `blob:sha256:`.\n\nNo session-local counter is used.\n\n## Artifact IDs: session-local monotonic integer\n\n`ArtifactManager` scans existing `*.log` artifact files on first use to find max existing numeric ID and sets `nextId = max + 1`.\n\nAllocation behavior:\n\n- file format: `{id}.{toolType}.log`\n- IDs are sequential strings (`\"0\"`, `\"1\"`, ...)\n- resume does not overwrite existing artifacts because scan happens before allocation.\n\nIf artifact directory is missing, scanning yields empty list and allocation starts from `0`.\n\n## Agent output IDs (`agent://`)\n\n`AgentOutputManager` allocates IDs for subagent outputs as `-` (optionally nested under parent prefix, e.g. `0-Parent.1-Child`). It scans existing `.md` files on initialization to continue from the next index on resume.\n\n## Persistence dataflow\n\n## 1) Session entry persistence rewrite path\n\nBefore session entries are written (`#rewriteFile` / incremental persist), `SessionManager` calls `prepareEntryForPersistence()` (via `truncateForPersistence`).\n\nKey behaviors:\n\n1. **Large string truncation**: oversized strings are cut and suffixed with `\"[Session persistence truncated large content]\"`; signature fields (`thinkingSignature`, `thoughtSignature`, `textSignature`) are cleared instead of truncated.\n2. **Transient field stripping**: `partialJson` and `jsonlEvents` are removed from persisted entries.\n3. **Image externalization to blobs**:\n - image blocks in `content` arrays are externalized when `data` is not already a blob ref and base64 length is at least threshold (`BLOB_EXTERNALIZE_THRESHOLD = 1024`),\n - provider-style `image_url` data URLs are externalized when they start with `data:image/` and contain `;base64,`,\n - image block `data` is stored as decoded binary bytes,\n - provider data URLs are stored as the original UTF-8 data URL string,\n - persisted values are replaced with `blob:sha256:`.\n\nThis keeps session JSONL compact while preserving recoverability.\n\n## 2) Session load rehydration path\n\nWhen opening a session (`setSessionFile`), after migrations, `SessionManager` runs `resolveBlobRefsInEntries()`.\n\nFor message/custom-message image blocks with `blob:sha256:` and for persisted provider `image_url` fields with blob refs:\n\n- reads blob bytes from blob store,\n- converts image-block bytes back to base64,\n- converts provider `image_url` blobs back to the original string,\n- mutates in-memory entry fields for runtime consumers.\n\nIf blob is missing:\n\n- `resolveImageData()` logs warning,\n- returns original ref string unchanged,\n- load continues (no hard crash).\n\n## 3) Tool output spill/truncation path\n\n`OutputSink` powers streaming output in bash/python/ssh and related executors.\n\nBehavior:\n\n1. Every chunk is sanitized and appended to in-memory tail buffer.\n2. When in-memory bytes exceed spill threshold (`DEFAULT_MAX_BYTES`, 50KB), sink marks output truncated.\n3. If an artifact path is available, sink opens a file writer and writes:\n - existing buffered content once,\n - all subsequent chunks.\n4. In-memory buffer is always trimmed to tail window for display.\n5. `dump()` returns summary including `artifactId` only when file sink was successfully created.\n\nPractical effect:\n\n- UI/tool return shows truncated tail,\n- full output is preserved in artifact file and referenced as `artifact://`.\n\nIf file sink creation fails (I/O error, missing path, etc.), sink silently falls back to in-memory truncation only; full output is not persisted.\n\n## URL access model\n\n## `blob:` references\n\n`blob:sha256:` is a persistence reference inside session entry payloads, not an internal URL scheme handled by the router. Resolution is done by `SessionManager` during session load.\n\n## `artifact://`\n\nHandled by `ArtifactProtocolHandler`:\n\n- requires active session artifact directory,\n- ID must be numeric,\n- resolves by matching filename prefix `.`,\n- returns raw text (`text/plain`) from the matched `.log` file,\n- when missing, error includes list of available artifact IDs.\n\nMissing directory behavior:\n\n- if artifacts directory does not exist, throws `No artifacts directory found`.\n\n## `agent://`\n\nHandled by `AgentProtocolHandler` over `/.md`:\n\n- plain form returns markdown text,\n- `/path` or `?q=` forms perform JSON extraction,\n- path and query extraction cannot be combined,\n- if extraction requested, file content must parse as JSON.\n\nMissing directory behavior:\n\n- throws `No artifacts directory found`.\n\nMissing output behavior:\n\n- throws `Not found: ` with available IDs from existing `.md` files.\n\nRead tool integration:\n\n- `read` supports offset/limit pagination for non-extraction internal URL reads,\n- rejects `offset/limit` when `agent://` extraction is used.\n\n## Resume, fork, and move semantics\n\n## Resume\n\n- `ArtifactManager` scans existing `{id}.*.log` files on first allocation and continues numbering.\n- `AgentOutputManager` scans existing `.md` output IDs and continues numbering.\n- `SessionManager` rehydrates blob refs to base64 on load.\n\n## Fork\n\n`SessionManager.fork()` creates a new session file with new session ID and `parentSession` link, then returns old/new file paths. Artifact copying is handled by `AgentSession.fork()`:\n\n- attempts recursive copy of old artifact directory to new artifact directory,\n- missing old directory is tolerated,\n- non-ENOENT copy errors are logged as warnings and fork still completes.\n\nID implications after fork:\n\n- if copy succeeded, artifact counters in new session continue after max copied ID,\n- if copy failed/skipped, new session artifact IDs start from `0`.\n\nBlob implications after fork:\n\n- blobs are global and content-addressed, so no blob directory copy is required.\n\n## Move to new cwd\n\n`SessionManager.moveTo()` renames both session file and artifact directory to the new default session directory, with rollback logic if a later step fails. This preserves artifact identity while relocating session scope.\n\n## Failure handling and fallback paths\n\n| Case | Behavior |\n| -------------------------------------------------------- | --------------------------------------------------------------------- |\n| Blob file missing during rehydration | Warn and keep `blob:sha256:` ref string in-memory |\n| Blob read ENOENT via `BlobStore.get` | Returns `null` |\n| Artifact directory missing (`ArtifactManager.listFiles`) | Returns empty list (allocation can start fresh) |\n| Artifact directory missing (`artifact://` / `agent://`) | Throws explicit `No artifacts directory found` |\n| Artifact ID not found | Throws with available IDs listing |\n| OutputSink artifact writer init fails | Continues with tail-only truncation (no full-output artifact) |\n| No session file (some task paths) | Task tool falls back to temp artifacts directory for subagent outputs |\n\n## Binary blob externalization vs text-output artifacts\n\n- **Blob externalization** is for image payloads inside persisted session entry content and provider image data URLs; it replaces inline payload strings in JSONL with stable content refs.\n- **Artifacts** are plain text files for execution output and subagent output; they are addressable by session-local IDs through internal URLs.\n\nThe two systems intersect only indirectly (both reduce session JSONL bloat) but have different identity, lifetime, and retrieval paths.\n\n## Implementation files\n\n- [`src/session/blob-store.ts`](../packages/coding-agent/src/session/blob-store.ts) — blob reference format, hashing, put/get, externalize/resolve helpers.\n- [`src/session/artifacts.ts`](../packages/coding-agent/src/session/artifacts.ts) — session artifact directory model and numeric artifact ID/path allocation.\n- [`src/session/streaming-output.ts`](../packages/coding-agent/src/session/streaming-output.ts) — `OutputSink` truncation/spill-to-file behavior and summary metadata.\n- [`src/session/session-manager.ts`](../packages/coding-agent/src/session/session-manager.ts) — persistence transforms, blob rehydration on load, session fork/move interactions.\n- [`src/session/agent-session.ts`](../packages/coding-agent/src/session/agent-session.ts) — artifact directory copy during interactive fork.\n- [`src/internal-urls/artifact-protocol.ts`](../packages/coding-agent/src/internal-urls/artifact-protocol.ts) — `artifact://` resolver.\n- [`src/internal-urls/agent-protocol.ts`](../packages/coding-agent/src/internal-urls/agent-protocol.ts) — `agent://` resolver + JSON extraction.\n- [`src/sdk.ts`](../packages/coding-agent/src/sdk.ts) — internal URL router wiring and artifacts-dir resolver.\n- [`src/task/output-manager.ts`](../packages/coding-agent/src/task/output-manager.ts) — session-scoped agent output ID allocation for `agent://`.\n- [`src/task/executor.ts`](../packages/coding-agent/src/task/executor.ts) — subagent output artifact writes (`.md`) and temp artifact directory fallback.\n", "compaction.md": "# Compaction and Branch Summaries\n\nCompaction and branch summaries are the two mechanisms that keep long sessions usable without losing prior work context.\n\n- **Compaction** rewrites old history into a summary on the current branch.\n- **Branch summary** captures abandoned branch context during `/tree` navigation.\n\nBoth are persisted as session entries and converted back into user-context messages when rebuilding LLM input.\n\n## Key implementation files\n\n- `packages/agent/src/compaction/compaction.ts` (context-full summarization and handoff generation)\n- `packages/agent/src/compaction/branch-summarization.ts`\n- `packages/agent/src/compaction/pruning.ts`\n- `packages/agent/src/compaction/utils.ts`\n- `packages/agent/src/compaction/openai.ts`\n- `packages/coding-agent/src/session/session-manager.ts`\n- `packages/coding-agent/src/session/agent-session.ts`\n- `packages/coding-agent/src/session/messages.ts`\n- `packages/coding-agent/src/extensibility/hooks/types.ts`\n- `packages/coding-agent/src/config/settings-schema.ts`\n\n## Session entry model\n\nCompaction and branch summaries are first-class session entries, not plain assistant/user messages.\n\n- `CompactionEntry`\n - `type: \"compaction\"`\n - `summary`, optional `shortSummary`\n - `firstKeptEntryId` (compaction boundary)\n - `tokensBefore`\n - optional `details`, `preserveData`, `fromExtension`\n- `BranchSummaryEntry`\n - `type: \"branch_summary\"`\n - `fromId`, `summary`\n - optional `details`, `fromExtension`\n\nWhen context is rebuilt (`buildSessionContext`):\n\n1. Latest compaction on the active path is converted to one `compactionSummary` message.\n2. Kept entries from `firstKeptEntryId` to the compaction point are re-included.\n3. Later entries on the path are appended.\n4. `branch_summary` entries are converted to `branchSummary` messages.\n5. `custom_message` entries are converted to `custom` messages.\n\nThose custom roles are then transformed into LLM-facing user messages in `convertToLlm()` using the static templates:\n\n- `packages/agent/src/compaction/prompts/compaction-summary-context.md`\n- `packages/agent/src/compaction/prompts/branch-summary-context.md`\n- `packages/agent/src/compaction/prompts/handoff-document.md`\n\n## Compaction pipeline\n\n### Triggers\n\nCompaction/context maintenance can run in four ways:\n\n1. **Manual context compaction**: `/compact [instructions]` calls `AgentSession.compact(...)`.\n2. **Automatic overflow recovery**: after a same-model assistant error that matches context overflow.\n3. **Automatic threshold maintenance**: after a successful turn when context exceeds the resolved threshold.\n4. **Idle maintenance**: `runIdleCompaction()` can invoke the same auto-maintenance path with reason `\"idle\"`.\n\n### Compaction shape (visual)\n\n```text\nBefore compaction:\n\n entry: 0 1 2 3 4 5 6 7 8 9\n ┌─────┬─────┬─────┬──────┬─────┬─────┬──────┬──────┬─────┬──────┐\n │ hdr │ usr │ ass │ tool │ usr │ ass │ tool │ tool │ ass │ tool │\n └─────┴─────┴─────┴──────┴─────┴─────┴──────┴──────┴─────┴──────┘\n └────────┬───────┘ └──────────────┬──────────────┘\n messagesToSummarize kept messages\n ↑\n firstKeptEntryId (entry 4)\n\nAfter compaction (new entry appended):\n\n entry: 0 1 2 3 4 5 6 7 8 9 10\n ┌─────┬─────┬─────┬──────┬─────┬─────┬──────┬──────┬─────┬──────┬─────┐\n │ hdr │ usr │ ass │ tool │ usr │ ass │ tool │ tool │ ass │ tool │ cmp │\n └─────┴─────┴─────┴──────┴─────┴─────┴──────┴──────┴─────┴──────┴─────┘\n └──────────┬──────┘ └──────────────────────┬───────────────────┘\n not sent to LLM sent to LLM\n ↑\n starts from firstKeptEntryId\n\nWhat the LLM sees:\n\n ┌────────┬─────────┬─────┬─────┬──────┬──────┬─────┬──────┐\n │ system │ summary │ usr │ ass │ tool │ tool │ ass │ tool │\n └────────┴─────────┴─────┴─────┴──────┴──────┴─────┴──────┘\n ↑ ↑ └─────────────────┬────────────────┘\n prompt from cmp messages from firstKeptEntryId\n```\n\n### Overflow-retry vs threshold/idle maintenance\n\nThe automatic paths are intentionally different:\n\n- **Overflow recovery**\n - Trigger: current-model assistant error is detected as context overflow and the error is not older than the latest compaction.\n - The failing assistant error message is removed from active agent state before retry.\n - Context promotion is tried first; if a configured larger model is available, the agent switches model and retries without compacting.\n - If promotion is unavailable and compaction is enabled, context-full compaction runs with `reason: \"overflow\"` and `willRetry: true`; handoff strategy is not used for overflow.\n - On success, agent auto-continues (`agent.continue()`) after compaction.\n\n- **Threshold maintenance**\n - Trigger: successful, non-error assistant message whose adjusted context tokens exceed `resolveThresholdTokens(...)`.\n - Tool-output pruning can reduce the measured token count before threshold comparison.\n - Context promotion is tried before compaction.\n - If promotion is unavailable, auto maintenance runs with `reason: \"threshold\"` and `willRetry: false`.\n - With `compaction.strategy: \"handoff\"`, threshold maintenance starts a new handoff session instead of writing a compaction entry; if handoff returns no document without aborting, it falls back to context-full compaction.\n - On success, if `compaction.autoContinue !== false`, schedules an agent-authored developer auto-continue prompt from `prompts/system/auto-continue.md`.\n\n- **Idle maintenance**\n - Trigger: `runIdleCompaction()` when not streaming or already compacting.\n - Uses `reason: \"idle\"` and does not auto-continue afterward.\n\n### Pre-compaction pruning\n\nBefore compaction checks, tool-result pruning may run (`pruneToolOutputs`).\n\nDefault prune policy:\n\n- Protect newest `40_000` tool-output tokens.\n- Require at least `20_000` total estimated savings.\n- Never prune tool results from `skill` or `read`.\n\nPruned tool results are replaced with:\n\n- `[Output truncated - N tokens]`\n\nIf pruning changes entries, session storage is rewritten and agent message state is refreshed before compaction decisions.\n\n### Boundary and cut-point logic\n\n`prepareCompaction()` only considers entries since the last compaction entry (if any).\n\n1. Find previous compaction index.\n2. Compute `boundaryStart = prevCompactionIndex + 1`.\n3. Adapt `keepRecentTokens` using measured usage ratio when available.\n4. Run `findCutPoint()` over the boundary window.\n\nValid cut points include:\n\n- message entries with roles: `user`, `assistant`, `bashExecution`, `hookMessage`, `branchSummary`, `compactionSummary`\n- `custom_message` entries\n- `branch_summary` entries\n\nHard rule: never cut at `toolResult`.\n\nIf there are non-message metadata entries immediately before the cut point (`model_change`, `thinking_level_change`, labels, etc.), they are pulled into the kept region by moving cut index backward until a message or compaction boundary is hit.\n\n### Split-turn handling\n\nIf cut point is not at a user-turn start, compaction treats it as a split turn.\n\nTurn start detection treats these as user-turn boundaries:\n\n- `message.role === \"user\"`\n- `message.role === \"bashExecution\"`\n- `custom_message` entry\n- `branch_summary` entry\n\nSplit-turn compaction generates two summaries:\n\n1. History summary (`messagesToSummarize`)\n2. Turn-prefix summary (`turnPrefixMessages`)\n\nFinal stored summary is merged as:\n\n```markdown\n\n\n---\n\n**Turn Context (split turn):**\n\n\n```\n\n### Summary generation\n\n`compact(...)` builds summaries from serialized conversation text:\n\n1. Convert messages via `convertToLlm()`.\n2. Serialize with `serializeConversation()`.\n3. Wrap in `...`.\n4. Optionally include `...`.\n5. Optionally inject hook context as `` list.\n6. Execute summarization prompt with `SUMMARIZATION_SYSTEM_PROMPT`.\n\nPrompt selection:\n\n- first compaction: `compaction-summary.md`\n- iterative compaction with prior summary: `compaction-update-summary.md`\n- split-turn second pass: `compaction-turn-prefix.md`\n- short UI summary: `compaction-short-summary.md`\n- handoff document: `handoff-document.md` (used by `generateHandoff(...)`, not serialized compaction)\n\nRemote summarization modes:\n\n- If `compaction.remoteEndpoint` is set and remote compaction is enabled, local summary generation POSTs:\n - `{ systemPrompt, prompt }`\n- Expects JSON containing at least `{ summary }`.\n- For OpenAI/OpenAI Codex models, compaction first tries the provider-native `/responses/compact` endpoint when remote compaction is enabled. It preserves provider replacement history in `preserveData.openaiRemoteCompaction` and falls back to local summarization if that native request fails.\n\n### Handoff generation\n\n`packages/agent/src/compaction/compaction.ts` also exports `generateHandoff(...)`. Handoff generation uses the same `completeSimple(...)` oneshot style as summarization, but it preserves the live agent cache prefix by sending the active system prompt, tool array, and real LLM message history, then appending one agent-attributed `user` message containing the handoff prompt. It forces `toolChoice: \"none\"` and returns joined text blocks directly.\n\nHandoff does not write a `CompactionEntry`. `AgentSession.handoff()` owns the session transition: it starts a new session, injects the generated document as a visible `custom_message` with `customType: \"handoff\"`, and rebuilds agent messages from that new session.\n\n### File-operation context in summaries\n\nCompaction tracks cumulative file activity using assistant tool calls:\n\n- `read(path)` → read set\n- `write(path)` → modified set\n- `edit(path)` → modified set\n\nCumulative behavior:\n\n- Includes prior compaction details only when prior entry is pi-generated (`fromExtension !== true`).\n- In split turns, includes turn-prefix file ops too.\n- `readFiles` excludes files also modified.\n\nSummary text gets file tags appended via prompt template:\n\n```xml\n\n...\n\n\n...\n\n```\n\n### Persist and reload\n\nAfter summary generation (or hook-provided summary), agent session:\n\n1. Appends `CompactionEntry` with `appendCompaction(...)` for context-full maintenance; handoff strategy creates a new session and injects a handoff `custom_message` instead.\n2. Rebuilds display context from the active leaf via `buildDisplaySessionContext()`.\n3. Replaces live agent messages with rebuilt context.\n4. Emits `session_compact` hook event.\n\n## Branch summarization pipeline\n\nBranch summarization is tied to tree navigation, not token overflow.\n\n### Trigger\n\nDuring `navigateTree(...)`:\n\n1. Compute abandoned entries from old leaf to common ancestor using `collectEntriesForBranchSummary(...)`.\n2. If caller requested summary (`options.summarize`), generate summary before switching leaf.\n3. If summary exists, attach it at the navigation target using `branchWithSummary(...)`.\n\nOperationally this is commonly driven by `/tree` flow when `branchSummary.enabled` is enabled.\n\n### Branch switch shape (visual)\n\n```text\nTree before navigation:\n\n ┌─ B ─ C ─ D (old leaf, being abandoned)\n A ───┤\n └─ E ─ F (target)\n\nCommon ancestor: A\nEntries to summarize: B, C, D\n\nAfter navigation with summary:\n\n ┌─ B ─ C ─ D ─ [summary of B,C,D]\n A ───┤\n └─ E ─ F (new leaf)\n```\n\n### Preparation and token budget\n\n`generateBranchSummary(...)` computes budget as:\n\n- `tokenBudget = model.contextWindow - branchSummary.reserveTokens`\n\n`prepareBranchEntries(...)` then:\n\n1. First pass: collect cumulative file ops from all summarized entries, including prior pi-generated `branch_summary` details.\n2. Second pass: walk newest → oldest, adding messages until token budget is reached.\n3. Prefer preserving recent context.\n4. May still include large summary entries near budget edge for continuity.\n\nCompaction entries are included as messages (`compactionSummary`) during branch summarization input.\n\n### Summary generation and persistence\n\nBranch summarization:\n\n1. Converts and serializes selected messages.\n2. Wraps in ``.\n3. Uses custom instructions if supplied, otherwise `branch-summary.md`.\n4. Calls summarization model with `SUMMARIZATION_SYSTEM_PROMPT`.\n5. Prepends `branch-summary-preamble.md`.\n6. Appends file-operation tags.\n\nResult is stored as `BranchSummaryEntry` with optional details (`readFiles`, `modifiedFiles`).\n\n## Extension and hook touchpoints\n\n### `session_before_compact`\n\nPre-compaction hook.\n\nCan:\n\n- cancel compaction (`{ cancel: true }`)\n- provide full custom compaction payload (`{ compaction: CompactionResult }`)\n\n### `session.compacting`\n\nPrompt/context customization hook for default compaction.\n\nCan return:\n\n- `prompt` (override base summary prompt)\n- `context` (extra context lines injected into ``)\n- `preserveData` (stored on compaction entry)\n\n### `session_compact`\n\nPost-compaction notification with saved `compactionEntry` and `fromExtension` flag.\n\n### `session_before_tree`\n\nRuns on tree navigation before default branch summary generation.\n\nCan:\n\n- cancel navigation\n- provide custom `{ summary: { summary, details } }` used when user requested summarization\n\n### `session_tree`\n\nPost-navigation event exposing new/old leaf and optional summary entry.\n\n## Runtime behavior and failure semantics\n\n- Manual compaction aborts current agent operation first.\n- `abortCompaction()` cancels both manual and auto-compaction controllers.\n- Auto compaction emits start/end session events for UI/state updates.\n- Auto compaction can try multiple model candidates and retry transient failures; long retry delays prefer the next candidate when one is available.\n- Overflow errors are excluded from generic retry path because they are handled by context promotion/compaction.\n- If auto-compaction fails:\n - overflow path emits `Context overflow recovery failed: ...`\n - threshold path emits `Auto-compaction failed: ...`\n- Branch summarization can be cancelled via abort signal (e.g., Escape), returning canceled/aborted navigation result.\n\n## Settings and defaults\n\nFrom `settings-schema.ts`:\n\n- `compaction.enabled` = `true`\n- `compaction.strategy` = `\"context-full\"` (`\"handoff\"` and `\"off\"` are also supported)\n- `compaction.reserveTokens` = `16384`\n- `compaction.keepRecentTokens` = `20000`\n- `compaction.autoContinue` = `true`\n- `compaction.remoteEnabled` = `true`\n- `compaction.remoteEndpoint` = `undefined`\n- `compaction.thresholdPercent` = `-1` and `compaction.thresholdTokens` = `-1`; when no positive override is set, the threshold is `contextWindow - max(15% of contextWindow, reserveTokens)`\n- `compaction.idleEnabled` = `true`\n- `branchSummary.enabled` = `false`\n- `branchSummary.reserveTokens` = `16384`\n\nThese values are consumed at runtime by `AgentSession` and compaction/branch summarization modules.\n", "config-usage.md": "# Configuration Discovery and Resolution\n\nThis document describes how the coding-agent resolves configuration today: which roots are scanned, how precedence works, and how resolved config is consumed by settings, skills, hooks, tools, and extensions.\n\n## Scope\n\nPrimary implementation:\n\n- `packages/coding-agent/src/config.ts`\n- `packages/coding-agent/src/config/settings.ts`\n- `packages/coding-agent/src/config/settings-schema.ts`\n- `packages/coding-agent/src/discovery/builtin.ts`\n- `packages/coding-agent/src/discovery/helpers.ts`\n\nKey integration points:\n\n- `packages/coding-agent/src/capability/index.ts`\n- `packages/coding-agent/src/discovery/index.ts`\n- `packages/coding-agent/src/extensibility/skills.ts`\n- `packages/coding-agent/src/extensibility/hooks/loader.ts`\n- `packages/coding-agent/src/extensibility/custom-tools/loader.ts`\n- `packages/coding-agent/src/extensibility/extensions/loader.ts`\n\n---\n\n## Resolution flow (visual)\n\n```text\n Generic helper order (`config.ts`)\n┌───────────────────────────────────────┐\n│ 1) ~/.omp/agent, ~/.claude, ... │\n│ 2) /.omp, /.claude, ... │\n└───────────────────────────────────────┘\n │\n ▼\n capability providers enumerate items\n (native provider scans project .omp before user .omp;\n other providers have their own loading rules)\n │\n ▼\n provider priority sort + capability dedup\n │\n ▼\n subsystem-specific consumption\n (settings, skills, hooks, tools, extensions)\n```\n\n## 1) Config roots and source order\n\n## Canonical roots\n\n`src/config.ts` defines a fixed source priority list:\n\n1. `.omp` (native)\n2. `.claude`\n3. `.codex`\n4. `.gemini`\n\nUser-level bases:\n\n- `~/.omp/agent`\n- `~/.claude`\n- `~/.codex`\n- `~/.gemini`\n\nProject-level bases:\n\n- `/.omp`\n- `/.claude`\n- `/.codex`\n- `/.gemini`\n\n`CONFIG_DIR_NAME` is `.omp` (`packages/utils/src/dirs.ts`).\n\n## Important constraint\n\nThe generic helpers in `src/config.ts` do **not** include `.pi` in source discovery order.\n\n---\n\n## 2) Core discovery helpers (`src/config.ts`)\n\n## `getConfigDirs(subpath, options)`\n\nReturns ordered entries:\n\n- User-level entries first (by source priority)\n- Then project-level entries (by same source priority)\n\nOptions:\n\n- `user` (default `true`)\n- `project` (default `true`)\n- `cwd` (default `getProjectDir()`)\n- `existingOnly` (default `false`)\n\nThis API is used for directory-based config lookups (commands, hooks, tools, agents, etc.).\n\n## `findConfigFile(subpath, options)` / `findConfigFileWithMeta(...)`\n\nSearches for the first existing file across ordered bases, returns first match (path-only or path+metadata).\n\n## `findAllNearestProjectConfigDirs(subpath, cwd)`\n\nWalks parent directories upward and returns the **nearest existing directory per source base** (`.omp`, `.claude`, `.codex`, `.gemini`), then sorts results by source priority.\n\nUse this when project config should be inherited from ancestor directories (monorepo/nested workspace behavior).\n\n---\n\n## 3) File config wrapper (`ConfigFile` in `src/config.ts`)\n\n`ConfigFile` is the schema-validated loader for single config files.\n\nSupported formats:\n\n- `.yml` / `.yaml`\n- `.json` / `.jsonc`\n\nBehavior:\n\n- Validates parsed data against a provided Zod schema.\n- Caches load result until `invalidate()`.\n- Returns tri-state result via `tryLoad()`:\n - `ok`\n - `not-found`\n - `error` (`ConfigError` with schema/parse context)\n\nLegacy migration still supported:\n\n- If target path is `.yml`/`.yaml`, a sibling `.json` is auto-migrated once (`migrateJsonToYml`).\n\n---\n\n## 4) Settings resolution model (`src/config/settings.ts`)\n\nThe runtime settings model is layered:\n\n1. Global settings: `~/.omp/agent/config.yml`\n2. Project settings: discovered via settings capability (`settings.json` from providers)\n3. Runtime overrides: in-memory, non-persistent\n4. Schema defaults: from `SETTINGS_SCHEMA`\n\nEffective read path:\n\n`defaults <- global <- project <- overrides`\n\nWrite behavior:\n\n- `settings.set(...)` writes to the **global** layer (`config.yml`) and queues background save.\n- Project settings are read-only from capability discovery.\n\n## Migration behavior still active\n\nOn startup, if `config.yml` is missing:\n\n1. Migrate from `~/.omp/agent/settings.json` (renamed to `.bak` on success)\n2. Merge with legacy DB settings from `agent.db`\n3. Write merged result to `config.yml`\n\nField-level migrations in `#migrateRawSettings`:\n\n- `queueMode` -> `steeringMode`\n- `ask.timeout` milliseconds -> seconds when old value looks like ms (`> 1000`)\n- Legacy flat `theme: \"...\"` -> `theme.dark/theme.light` structure\n\n---\n\n## 5) Capability/discovery integration\n\nMost non-core config loading flows through the capability registry (`src/capability/index.ts` + `src/discovery/index.ts`).\n\n## Provider ordering\n\nProviders are sorted by numeric priority (higher first). Example priorities:\n\n- Native OMP (`builtin.ts`): `100`\n- Claude: `80`\n- Codex / agents / Claude marketplace: `70`\n- Gemini: `60`\n\n```text\nProvider precedence (higher wins)\n\nnative (.omp) priority 100\nclaude priority 80\ncodex / agents / ... priority 70\ngemini priority 60\n```\n\n## Dedup semantics\n\nCapabilities define a `key(item)`:\n\n- same key => first item wins (higher-priority/earlier-loaded item)\n- no key (`undefined`) => no dedup, all items retained\n\nRelevant keys:\n\n- skills: `name`\n- tools: `name`\n- hooks: `${type}:${tool}:${name}`\n- extension modules: `name`\n- extensions: `name`\n- settings: no dedup (all items preserved)\n\n---\n\n## 6) Native `.omp` provider behavior (`packages/coding-agent/src/discovery/builtin.ts`)\n\nNative provider (`id: native`) reads native config from:\n\n- project: `/.omp/...`\n- user: `~/.omp/agent/...`\n\n### Directory admission rules\n\n- Slash commands, rules, prompts, instructions, hooks, tools, extensions, extension modules, and settings use a project/user root only when the root directory exists and is non-empty.\n- Skills scan `/.omp/skills` for each ancestor from the current working directory up to the repo root/home boundary, plus `~/.omp/agent/skills`, without requiring the root `.omp` directory itself to be non-empty.\n- `SYSTEM.md` and `AGENTS.md` read user-level files directly and use nearest-ancestor project `.omp` lookup for project files, but the project `.omp` directory must be non-empty.\n\n### Scope-specific loading\n\n- Skills: `/.omp/skills/*/SKILL.md` and `~/.omp/agent/skills/*/SKILL.md`\n- Slash commands: `commands/*.md`\n- Rules: `rules/*.{md,mdc}`\n- Prompts: `prompts/*.md`\n- Instructions: `instructions/*.md`\n- Hooks: `hooks/pre/*`, `hooks/post/*`\n- Tools: `tools/*.{json,md,ts,js,sh,bash,py}` and `tools//index.ts`\n- Extension modules: discovered under `extensions/` (+ legacy `settings.json.extensions` string array)\n- Extensions: `extensions//gemini-extension.json`\n- Settings capability: `settings.json`\n\n### Nearest-project lookup nuance\n\n## For `SYSTEM.md` and `AGENTS.md`, native provider uses nearest-ancestor project `.omp` directory search (walk-up) and still requires the project `.omp` dir to be non-empty.\n\n## 7) How major subsystems consume config\n\n## Settings subsystem\n\n- `Settings.init()` loads global `config.yml` + discovered project `settings.json` capability items.\n- Only capability items with `level === \"project\"` are merged into project layer.\n\n## Skills subsystem\n\n- `extensibility/skills.ts` loads via `loadCapability(skillCapability.id, { cwd })`.\n- Applies source toggles and filters (`ignoredSkills`, `includeSkills`, custom dirs).\n- Legacy-named toggles still exist (`skills.enablePiUser`, `skills.enablePiProject`) but they gate the native provider (`provider === \"native\"`).\n\n## Hooks subsystem\n\n- `discoverAndLoadHooks()` resolves hook paths from hook capability + explicit configured paths.\n- Then loads modules via Bun import.\n\n## Tools subsystem\n\n- `discoverAndLoadCustomTools()` resolves tool paths from tool capability + plugin tool paths + explicit configured paths.\n- Declarative `.md/.json` tool files are metadata only; executable loading expects code modules.\n\n## Extensions subsystem\n\n- `discoverAndLoadExtensions()` resolves extension modules from extension-module capability plus explicit paths.\n- Current implementation intentionally keeps only capability items with `_source.provider === \"native\"` before loading.\n\n---\n\n## 8) Precedence rules to rely on\n\nUse this mental model:\n\n1. Source directory ordering from `config.ts` determines candidate path order.\n2. Capability provider priority determines cross-provider precedence.\n3. Capability key dedup determines collision behavior (first wins for keyed capabilities).\n4. Subsystem-specific merge logic can further change effective precedence (especially settings).\n\n### Settings-specific caveat\n\nSettings capability items are not deduplicated; `Settings.#loadProjectSettings()` deep-merges project items in returned order. Because merge applies later item values over earlier values, effective override behavior depends on provider emission order, not just capability key semantics.\n\n---\n\n## 9) Legacy/compatibility behaviors still present\n\n- `ConfigFile` JSON -> YAML migration for YAML-targeted files.\n- Settings migration from `settings.json` and `agent.db` to `config.yml`.\n- Settings key migrations (`queueMode`, `ask.timeout`, flat `theme`, `task.isolation.enabled`, `statusLine.plan_mode`).\n- Legacy setting names `skills.enablePiUser` / `skills.enablePiProject` are still active gates for native skill source.\n\nIf these compatibility paths are removed in code, update this document immediately; several runtime behaviors still depend on them today.\n", "custom-tools.md": "# Custom Tools\n\nCustom tools are model-callable functions that plug into the same tool execution pipeline as built-in tools.\n\nA custom tool is a TypeScript/JavaScript module that exports a factory. The factory receives a host API (`CustomToolAPI`) and returns one tool or an array of tools.\n\n## What this is (and is not)\n\n- **Custom tool**: callable by the model during a turn (`execute` + Zod parameter schema).\n- **Extension**: lifecycle/event framework that can register tools and intercept/modify events.\n- **Hook**: external pre/post command scripts.\n- **Skill**: static guidance/context package, not executable tool code.\n\nIf you need the model to call code directly, use a custom tool.\n\n## Integration paths in current code\n\nThere are two active integration styles:\n\n1. **SDK-provided custom tools** (`options.customTools`)\n - Wrapped into agent tools via `CustomToolAdapter` or extension wrappers.\n - Always included in the initial active tool set in SDK bootstrap.\n\n2. **Filesystem-discovered modules via loader API** (`discoverAndLoadCustomTools` / `loadCustomTools`)\n - Exposed as library APIs in `src/extensibility/custom-tools/loader.ts`.\n - Host code can call these to discover and load tool modules from config/provider/plugin paths.\n\n```text\nModel tool call flow\n\nLLM tool call\n │\n ▼\nTool registry (built-ins + custom tool adapters)\n │\n ▼\nCustomTool.execute(toolCallId, params, onUpdate, ctx, signal)\n │\n ├─ onUpdate(...) -> streamed partial result\n └─ return result -> final tool content/details\n```\n\n## Discovery locations (loader API)\n\n`discoverAndLoadCustomTools(configuredPaths, cwd, builtInToolNames)` merges:\n\n1. Capability providers (`toolCapability`), including:\n - Native OMP config (`~/.omp/agent/tools`, `.omp/tools`)\n - Claude config (`~/.claude/tools`, `.claude/tools`)\n - Codex config (`~/.codex/tools`, `.codex/tools`)\n - Claude marketplace plugin cache provider\n2. Installed plugin manifests (`~/.omp/plugins/node_modules/*` via plugin loader)\n3. Explicit configured paths passed to the loader\n\n### Important behavior\n\n- Duplicate resolved paths are deduplicated.\n- Tool name conflicts are rejected against built-ins and already-loaded custom tools.\n- `.md` and `.json` files are discovered as tool metadata by some providers, but the executable module loader rejects them as runnable tools.\n- Relative configured paths are resolved from `cwd`; `~` is expanded.\n\n## Module contract\n\nA custom tool module must export a function (default export preferred):\n\n```ts\nimport type { CustomToolFactory } from \"@oh-my-pi/pi-coding-agent\";\n\nconst factory: CustomToolFactory = (pi) => ({\n\tname: \"repo_stats\",\n\tlabel: \"Repo Stats\",\n\tdescription: \"Counts tracked TypeScript files\",\n\tparameters: pi.zod.object({\n\t\tglob: pi.zod.string().optional().default(\"**/*.ts\"),\n\t}),\n\n\tasync execute(toolCallId, params, onUpdate, ctx, signal) {\n\t\tonUpdate?.({\n\t\t\tcontent: [{ type: \"text\", text: \"Scanning files...\" }],\n\t\t\tdetails: { phase: \"scan\" },\n\t\t});\n\n\t\tconst result = await pi.exec(\"git\", [\"ls-files\", params.glob ?? \"**/*.ts\"], { signal, cwd: pi.cwd });\n\t\tif (result.killed) {\n\t\t\tthrow new Error(\"Scan was cancelled\");\n\t\t}\n\t\tif (result.code !== 0) {\n\t\t\tthrow new Error(result.stderr || \"git ls-files failed\");\n\t\t}\n\n\t\tconst files = result.stdout.split(\"\\n\").filter(Boolean);\n\t\treturn {\n\t\t\tcontent: [{ type: \"text\", text: `Found ${files.length} files` }],\n\t\t\tdetails: { count: files.length, sample: files.slice(0, 10) },\n\t\t};\n\t},\n\n\tonSession(event) {\n\t\tif (event.reason === \"shutdown\") {\n\t\t\t// cleanup resources if needed\n\t\t}\n\t},\n});\n\nexport default factory;\n```\n\nSchemas are authored with Zod (`pi.zod`) and flow through the shared validation/wire pipeline.\n\nFactory return type:\n\n- `CustomTool`\n- `CustomTool[]`\n- `Promise`\n\n## API surface passed to factories (`CustomToolAPI`)\n\nFrom `types.ts` and `loader.ts`:\n\n- `cwd`: host working directory\n- `exec(command, args, options?)`: process execution helper\n- `ui`: UI context (can be no-op in headless modes)\n- `hasUI`: `false` in non-interactive flows\n- `logger`: shared file logger\n- `zod`: injected `zod` module (use `pi.zod.object`, `pi.zod.string`, …)\n- `pi`: injected `@oh-my-pi/pi-coding-agent` exports\n- `pushPendingAction(action)`: register a preview action for hidden `resolve` tool (`docs/resolve-tool-runtime.md`)\n\nLoader starts with a no-op UI context and requires host code to call `setUIContext(...)` when real UI is ready.\n\n## Execution contract and typing\n\n`CustomTool.execute` signature:\n\n```ts\nexecute(toolCallId, params, onUpdate, ctx, signal);\n```\n\n- `params` is statically typed from your Zod schema via `z.infer` (`Static` in API types).\n- Runtime argument validation happens before execution in the agent loop.\n- `onUpdate` emits partial results for UI streaming.\n- `ctx` includes session/model state and an `abort()` helper.\n- `signal` carries cancellation.\n\n`CustomToolAdapter` bridges this to the agent tool interface and forwards calls in the correct argument order.\n\n## How tools are exposed to the model\n\n- Tools are wrapped into `AgentTool` instances (`CustomToolAdapter` or extension wrappers).\n- They are inserted into the session tool registry by name.\n- In SDK bootstrap, custom and extension-registered tools are force-included in the initial active set.\n- CLI `--tools` currently validates only built-in tool names; custom tool inclusion is handled through discovery/registration paths and SDK options.\n\n## Rendering hooks\n\nOptional rendering hooks:\n\n- `renderCall(args, options, theme)`\n- `renderResult(result, options, theme, args?)`\n\nRuntime behavior in TUI:\n\n- If hooks exist, tool output is rendered inside a `Box` container.\n- `renderResult` receives `{ expanded, isPartial, spinnerFrame? }`.\n- Renderer errors are caught and logged; UI falls back to default text rendering.\n\n## Session/state handling\n\nOptional `onSession(event, ctx)` receives session lifecycle events, including:\n\n- `start`, `switch`, `branch`, `tree`, `shutdown`\n- `auto_compaction_start`, `auto_compaction_end`\n- `auto_retry_start`, `auto_retry_end`\n- `ttsr_triggered`, `todo_reminder`\n\nUse `ctx.sessionManager` to reconstruct state from history when branch/session context changes.\n\n## Failures and cancellation semantics\n\n### Synchronous/async failures\n\n- Throwing (or rejected promises) in `execute` is treated as tool failure.\n- Agent runtime converts failures into tool result messages with `isError: true` and error text content.\n- With extension wrappers, `tool_result` handlers can further rewrite content/details and even override error status.\n\n### Cancellation\n\n- Agent abort propagates through `AbortSignal` to `execute`.\n- Forward `signal` to subprocess work (`pi.exec(..., { signal })`) for cooperative cancellation.\n- `ctx.abort()` lets a tool request abort of the current agent operation.\n\n### onSession errors\n\n- `onSession` errors are caught and logged as warnings; they do not crash the session.\n\n## Real constraints to design for\n\n- Tool names must be globally unique in the active registry.\n- Prefer deterministic, schema-shaped outputs in `details` for renderer/state reconstruction.\n- Guard UI usage with `pi.hasUI`.\n- Treat `.md`/`.json` in tool directories as metadata, not executable modules.\n", "environment-variables.md": "# Environment Variables (Current Runtime Reference)\n\nThis reference is derived from current code paths in:\n\n- `packages/coding-agent/src/**`\n- `packages/ai/src/**` (provider/auth resolution used by coding-agent)\n- `packages/utils/src/**` and `packages/tui/src/**` where those vars directly affect coding-agent runtime\n\nIt documents only active behavior.\n\n## Resolution model and precedence\n\nMost runtime lookups use `$env` from `@oh-my-pi/pi-utils` (`packages/utils/src/env.ts`).\n\n`$env` loading order:\n\n1. Existing process environment (`Bun.env`)\n2. Project `.env` (`$PWD/.env`) for keys not already set\n3. Agent `.env` (`~/.omp/agent/.env`, respecting `PI_CONFIG_DIR` / `PI_CODING_AGENT_DIR`) for keys not already set\n4. Config-root `.env` (`~/.omp/.env`, respecting `PI_CONFIG_DIR`) for keys not already set\n5. Home `.env` (`~/.env`) for keys not already set\n\nAdditional rule inside each `.env` file: `OMP_*` keys are mirrored to `PI_*` keys in that parsed file.\n\n---\n\n## 1) Model/provider authentication\n\nThese are consumed via `getEnvApiKey()` (`packages/ai/src/stream.ts`) unless noted otherwise.\n\n### Core provider credentials\n\n| Variable | Used for | Required when | Notes / precedence |\n| ------------------------------- | ------------------------------------------------ | -------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- |\n| `ANTHROPIC_OAUTH_TOKEN` | Anthropic API auth | Using Anthropic with OAuth token auth | Takes precedence over `ANTHROPIC_API_KEY` for provider auth resolution |\n| `ANTHROPIC_API_KEY` | Anthropic API auth | Using Anthropic without OAuth token | Fallback after `ANTHROPIC_OAUTH_TOKEN` |\n| `ANTHROPIC_FOUNDRY_API_KEY` | Anthropic via Azure Foundry / enterprise gateway | `CLAUDE_CODE_USE_FOUNDRY` enabled | Takes precedence over `ANTHROPIC_OAUTH_TOKEN` and `ANTHROPIC_API_KEY` when Foundry mode is enabled |\n| `OPENAI_API_KEY` | OpenAI auth | Using OpenAI-family providers without explicit apiKey argument | Used by OpenAI Completions/Responses providers |\n| `GEMINI_API_KEY` | Google Gemini auth | Using `google` provider models | Primary key for Gemini provider mapping |\n| `GOOGLE_API_KEY` | Gemini image tool auth fallback | Using `gemini_image` tool without `GEMINI_API_KEY` | Used by coding-agent image tool fallback path |\n| `GROQ_API_KEY` | Groq auth | Using Groq models | |\n| `CEREBRAS_API_KEY` | Cerebras auth | Using Cerebras models | |\n| `FIREWORKS_API_KEY` | Fireworks auth | Using Fireworks models | |\n| `TOGETHER_API_KEY` | Together auth | Using `together` provider | |\n| `HUGGINGFACE_HUB_TOKEN` | Hugging Face auth | Using `huggingface` provider | Primary Hugging Face token env var |\n| `HF_TOKEN` | Hugging Face auth | Using `huggingface` provider | Fallback when `HUGGINGFACE_HUB_TOKEN` is unset |\n| `SYNTHETIC_API_KEY` | Synthetic auth | Using Synthetic models | |\n| `NVIDIA_API_KEY` | NVIDIA auth | Using `nvidia` provider | |\n| `NANO_GPT_API_KEY` | NanoGPT auth | Using `nanogpt` provider | |\n| `VENICE_API_KEY` | Venice auth | Using `venice` provider | |\n| `LITELLM_API_KEY` | LiteLLM auth | Using `litellm` provider | OpenAI-compatible LiteLLM proxy key |\n| `LM_STUDIO_API_KEY` | LM Studio auth (optional) | Using `lm-studio` provider with authenticated hosts | Local LM Studio usually runs without auth; any non-empty token works when a key is required |\n| `OLLAMA_API_KEY` | Ollama auth (optional) | Using `ollama` provider with authenticated hosts | Local Ollama usually runs without auth; any non-empty token works when a key is required |\n| `LLAMA_CPP_API_KEY` | llama.cpp auth (optional) | Using `llama.cpp` provider with authenticated hosts | Local llama.cpp usually runs without auth; any non-empty token works when a key is configured |\n| `XIAOMI_API_KEY` | Xiaomi MiMo auth | Using `xiaomi` provider | |\n| `MOONSHOT_API_KEY` | Moonshot auth | Using `moonshot` provider | |\n| `XAI_API_KEY` | xAI auth | Using xAI models | |\n| `OPENROUTER_API_KEY` | OpenRouter auth | Using OpenRouter models | Also used by image tool when preferred/auto provider is OpenRouter |\n| `MISTRAL_API_KEY` | Mistral auth | Using Mistral models | |\n| `ZAI_API_KEY` | z.ai auth | Using z.ai models | Also used by z.ai web search provider |\n| `MINIMAX_API_KEY` | MiniMax auth | Using `minimax` provider | |\n| `MINIMAX_CODE_API_KEY` | MiniMax Code auth | Using `minimax-code` provider | |\n| `MINIMAX_CODE_CN_API_KEY` | MiniMax Code CN auth | Using `minimax-code-cn` provider | |\n| `OPENCODE_API_KEY` | OpenCode auth | Using `opencode-go` / `opencode-zen` models | |\n| `QIANFAN_API_KEY` | Qianfan auth | Using `qianfan` provider | |\n| `QWEN_OAUTH_TOKEN` | Qwen Portal auth | Using `qwen-portal` with OAuth token | Takes precedence over `QWEN_PORTAL_API_KEY` |\n| `QWEN_PORTAL_API_KEY` | Qwen Portal auth | Using `qwen-portal` with API key | Fallback after `QWEN_OAUTH_TOKEN` |\n| `ZENMUX_API_KEY` | ZenMux auth | Using `zenmux` provider | Used for ZenMux OpenAI and Anthropic-compatible routes |\n| `VLLM_API_KEY` | vLLM auth/discovery opt-in | Using `vllm` provider (local OpenAI-compatible servers) | Any non-empty value works for no-auth local servers |\n| `CURSOR_ACCESS_TOKEN` | Cursor provider auth | Using Cursor provider | |\n| `AI_GATEWAY_API_KEY` | Vercel AI Gateway auth | Using `vercel-ai-gateway` provider | |\n| `CLOUDFLARE_AI_GATEWAY_API_KEY` | Cloudflare AI Gateway auth | Using `cloudflare-ai-gateway` provider | Base URL must be configured as `https://gateway.ai.cloudflare.com/v1///anthropic` |\n| `ALIBABA_CODING_PLAN_API_KEY` | Alibaba Coding Plan auth | Using `alibaba-coding-plan` provider | |\n| `DEEPSEEK_API_KEY` | DeepSeek auth | Using DeepSeek models | |\n| `KILO_API_KEY` | Kilo auth | Using Kilo models | |\n| `OLLAMA_CLOUD_API_KEY` | Ollama Cloud auth | Using `ollama-cloud` provider | |\n| `GITLAB_TOKEN` | GitLab Duo auth | Using `gitlab-duo` provider | |\n\n### GitHub/Copilot token chains\n\n| Variable | Used for | Chain |\n| ---------------------- | ------------------------------------------------ | ---------------------------------------------------- |\n| `COPILOT_GITHUB_TOKEN` | GitHub Copilot provider auth | `COPILOT_GITHUB_TOKEN` → `GH_TOKEN` → `GITHUB_TOKEN` |\n| `GH_TOKEN` | Copilot fallback; GitHub API auth in web scraper | In web scraper: `GITHUB_TOKEN` → `GH_TOKEN` |\n| `GITHUB_TOKEN` | Copilot fallback; GitHub API auth in web scraper | In web scraper: checked before `GH_TOKEN` |\n\n### Auth broker / auth gateway (remote credential vault)\n\nWhen the broker is enabled, the local SQLite credential store is bypassed and all OAuth refresh / access tokens live on the broker host. See [`auth-broker-gateway.md`](./auth-broker-gateway.md) for the full protocol, CLI surface, and 5-min/15-s usage cache layering.\n\n| Variable | Used for | Required when | Notes / precedence |\n| ----------------------- | ------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| `OMP_AUTH_BROKER_URL` | Base URL of the remote auth-broker (e.g. `https://broker.tailnet:8765`); selects broker mode | Resolving credentials through a broker; also required by `omp auth-gateway serve` (the gateway is itself a broker client) | Wins over `auth.broker.url` in `config.yml`. When set with no resolvable token, `resolveAuthBrokerConfig()` hard-errors instead of falling back to local SQLite. |\n| `OMP_AUTH_BROKER_TOKEN` | Bearer token sent on every broker endpoint except `/v1/healthz` | `OMP_AUTH_BROKER_URL` is set and no token is available from `auth.broker.token` or `/auth-broker.token` | Resolution: this env → `auth.broker.token` (`$ENV_NAME` indirection supported) → `/auth-broker.token` (mode `0600`). `` is `~/.omp/` (respecting `PI_CONFIG_DIR`). |\n\nThe gateway has no dedicated env vars — it inherits `OMP_AUTH_BROKER_*`. Its own inbound bearer token lives at `/auth-gateway.token` and is managed via `omp auth-gateway token`.\n\n---\n\n## 2) Provider-specific runtime configuration\n\n### Anthropic Foundry Gateway (Azure / enterprise proxy)\n\nWhen `CLAUDE_CODE_USE_FOUNDRY` is enabled, Anthropic requests switch to Foundry mode:\n\n- Base URL resolves from `FOUNDRY_BASE_URL` (fallback remains model/default base URL if unset).\n- API key resolution for provider `anthropic` becomes:\n `ANTHROPIC_FOUNDRY_API_KEY` → `ANTHROPIC_OAUTH_TOKEN` → `ANTHROPIC_API_KEY`.\n- `ANTHROPIC_CUSTOM_HEADERS` is parsed as comma/newline-separated `key: value` pairs and merged into request headers.\n- TLS client/server material can be injected from env values:\n `NODE_EXTRA_CA_CERTS`, `CLAUDE_CODE_CLIENT_CERT`, `CLAUDE_CODE_CLIENT_KEY`.\n Each accepts either:\n - a filesystem path to PEM content, or\n - inline PEM (including escaped `\\n` sequences).\n\n| Variable | Value type | Behavior |\n| --------------------------- | ---------------------------------------------- | ----------------------------------------------------------------------------- |\n| `CLAUDE_CODE_USE_FOUNDRY` | Boolean-like string (`1`, `true`, `yes`, `on`) | Enables Foundry mode for Anthropic provider |\n| `FOUNDRY_BASE_URL` | URL string | Anthropic endpoint base URL in Foundry mode |\n| `ANTHROPIC_FOUNDRY_API_KEY` | Token string | Used for `Authorization: Bearer ` |\n| `ANTHROPIC_CUSTOM_HEADERS` | Header list string | Extra headers; format `header-a: value, header-b: value` or newline-separated |\n| `NODE_EXTRA_CA_CERTS` | PEM path or inline PEM | Extra CA chain for server certificate validation |\n| `CLAUDE_CODE_CLIENT_CERT` | PEM path or inline PEM | mTLS client certificate |\n| `CLAUDE_CODE_CLIENT_KEY` | PEM path or inline PEM | mTLS client private key (must be paired with cert) |\n\n### Amazon Bedrock\n\n| Variable | Default / behavior |\n| ------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |\n| `AWS_REGION` | Primary region source |\n| `AWS_DEFAULT_REGION` | Fallback if `AWS_REGION` unset |\n| `AWS_PROFILE` | Enables named profile auth path |\n| `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY` | Enables IAM key auth path |\n| `AWS_BEARER_TOKEN_BEDROCK` | Enables bearer token auth path |\n| `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` / `AWS_CONTAINER_CREDENTIALS_FULL_URI` | Enables ECS task credential path |\n| `AWS_WEB_IDENTITY_TOKEN_FILE` + `AWS_ROLE_ARN` | Enables web identity auth path |\n| `AWS_BEDROCK_SKIP_AUTH` | If `1`, injects dummy credentials (proxy/non-auth scenarios) |\n| `AWS_BEDROCK_FORCE_HTTP1` | If `1`, forces Node HTTP/1 request handler |\n| `HTTPS_PROXY` / `HTTP_PROXY` / `ALL_PROXY` | Routes Bedrock runtime and AWS SSO credential calls through the configured proxy using HTTP/1 |\n| `NO_PROXY` | Excludes matching hosts from proxy routing when a proxy variable is configured |\n\nRegion fallback in provider code: `options.region` → `AWS_REGION` → `AWS_DEFAULT_REGION` → `us-east-1`.\n\n### Azure OpenAI Responses\n\n| Variable | Default / behavior |\n| ---------------------------------- | --------------------------------------------------------------------------- |\n| `AZURE_OPENAI_API_KEY` | Required unless API key passed as option |\n| `AZURE_OPENAI_API_VERSION` | Default `v1` |\n| `AZURE_OPENAI_BASE_URL` | Direct base URL override |\n| `AZURE_OPENAI_RESOURCE_NAME` | Used to construct base URL: `https://.openai.azure.com/openai/v1` |\n| `AZURE_OPENAI_DEPLOYMENT_NAME_MAP` | Optional mapping string: `modelId=deploymentName,model2=deployment2` |\n\nBase URL resolution: option `azureBaseUrl` → env `AZURE_OPENAI_BASE_URL` → option/env resource name → `model.baseUrl`.\n\n### Google Vertex AI\n\n| Variable | Required? | Notes |\n| -------------------------------- | ------------------------------ | ------------------------------------------------------------------------------------------------------------------------- |\n| `GOOGLE_CLOUD_PROJECT` | Yes (unless passed in options) | Fallback: `GCLOUD_PROJECT` |\n| `GCLOUD_PROJECT` | Fallback | Used as alternate project ID source |\n| `GOOGLE_CLOUD_PROJECT_ID` | OAuth login helper only | Used by Gemini CLI OAuth project discovery |\n| `GOOGLE_CLOUD_LOCATION` | Yes (unless passed in options) | No default in provider |\n| `GOOGLE_CLOUD_API_KEY` | Conditional | Direct Vertex API-key auth; otherwise ADC fallback can authenticate when project and location are set |\n| `GOOGLE_APPLICATION_CREDENTIALS` | Conditional | If set, file must exist; otherwise ADC fallback path is checked (`~/.config/gcloud/application_default_credentials.json`) |\n\n### Kimi\n\n| Variable | Default / behavior |\n| ---------------------- | -------------------------------------------------------- |\n| `KIMI_CODE_OAUTH_HOST` | Primary OAuth host override |\n| `KIMI_OAUTH_HOST` | Fallback OAuth host override |\n| `KIMI_CODE_BASE_URL` | Overrides Kimi usage endpoint base URL (`usage/kimi.ts`) |\n\nOAuth host chain: `KIMI_CODE_OAUTH_HOST` → `KIMI_OAUTH_HOST` → `https://auth.kimi.com`.\n\n### Gemini CLI compatibility\n\n| Variable | Default / behavior |\n| -------------------------- | --------------------------------------------------------------- |\n| `PI_AI_GEMINI_CLI_VERSION` | Overrides Gemini CLI user-agent version tag (`0.35.3` if unset) |\n\n### OpenAI Codex responses (feature/debug controls)\n\n| Variable | Behavior |\n| ------------------------------------ | ---------------------------------------------------- |\n| `PI_CODEX_DEBUG` | `1`/`true` enables Codex provider debug logging |\n| `PI_CODEX_WEBSOCKET` | `1`/`true` enables websocket transport preference |\n| `PI_CODEX_WEBSOCKET_V2` | `1`/`true` enables websocket v2 path |\n| `PI_CODEX_WEBSOCKET_IDLE_TIMEOUT_MS` | Positive integer override (default 300000) |\n| `PI_CODEX_WEBSOCKET_RETRY_BUDGET` | Non-negative integer override (default 5) |\n| `PI_CODEX_WEBSOCKET_RETRY_DELAY_MS` | Positive integer base backoff override (default 500) |\n| `PI_OPENAI_STREAM_IDLE_TIMEOUT_MS` | Positive integer OpenAI stream idle timeout override |\n\n### Cursor provider debug\n\n| Variable | Behavior |\n| ------------------ | ------------------------------------------------------------------------ |\n| `DEBUG_CURSOR` | Enables provider debug logs; `2`/`verbose` for detailed payload snippets |\n| `DEBUG_CURSOR_LOG` | Optional file path for JSONL debug log output |\n\n### Prompt cache compatibility switch\n\n| Variable | Behavior |\n| -------------------- | ----------------------------------------------------------------------------------------------------------------- |\n| `PI_CACHE_RETENTION` | If `long`, enables long retention where supported (`anthropic`, `openai-responses`, Bedrock retention resolution) |\n\n---\n\n## 3) Web search subsystem\n\n### Search provider credentials\n\n| Variable | Used by |\n| --------------------------------------------------- | ------------------------------------------------------------- |\n| `EXA_API_KEY` | Exa search provider and Exa MCP tools |\n| `BRAVE_API_KEY` | Brave search provider |\n| `PERPLEXITY_API_KEY` | Perplexity search provider API-key mode |\n| `PERPLEXITY_COOKIES` | Perplexity cookie-auth search mode |\n| `TAVILY_API_KEY` | Tavily search provider |\n| `ZAI_API_KEY` | z.ai search provider (also checks stored OAuth in `agent.db`) |\n| `OPENAI_API_KEY` / Codex OAuth in DB | Codex search provider availability/auth |\n| `PI_CODEX_WEB_SEARCH_MODEL` | Codex search provider model override |\n| `MOONSHOT_SEARCH_API_KEY` / `KIMI_SEARCH_API_KEY` | Kimi/Moonshot search provider env auth |\n| `MOONSHOT_SEARCH_BASE_URL` / `KIMI_SEARCH_BASE_URL` | Kimi/Moonshot search endpoint override |\n| `KAGI_API_KEY` | Kagi search provider |\n| `JINA_API_KEY` | Jina search provider |\n| `PARALLEL_API_KEY` | Parallel search provider |\n| `SEARXNG_ENDPOINT`, `SEARXNG_TOKEN` | SearXNG endpoint and optional bearer token |\n| `SEARXNG_BASIC_USERNAME`, `SEARXNG_BASIC_PASSWORD` | SearXNG HTTP Basic Auth credentials |\n\nSearXNG also reads the equivalent `searxng.endpoint`, `searxng.token`, `searxng.basicUsername`, and `searxng.basicPassword` settings from `~/.omp/agent/config.yml`; environment variables are fallbacks.\n\n### Anthropic web search auth chain\n\nAnthropic web search uses `findAnthropicAuth()` from `packages/ai/src/utils/anthropic-auth.ts` in this order:\n\n1. `ANTHROPIC_SEARCH_API_KEY` (+ optional `ANTHROPIC_SEARCH_BASE_URL`)\n2. `ANTHROPIC_FOUNDRY_API_KEY` when `CLAUDE_CODE_USE_FOUNDRY` is enabled\n3. Anthropic OAuth credentials from `agent.db` (must not expire within 5-minute buffer)\n4. Anthropic API-key credentials from `agent.db`\n5. Generic Anthropic env fallback: provider key (`ANTHROPIC_FOUNDRY_API_KEY` in Foundry mode, otherwise `ANTHROPIC_OAUTH_TOKEN`/`ANTHROPIC_API_KEY`) + optional `ANTHROPIC_BASE_URL` (`FOUNDRY_BASE_URL` when Foundry mode is enabled)\n\nRelated vars:\n\n| Variable | Default / behavior |\n| --------------------------- | ---------------------------------------------------- |\n| `ANTHROPIC_SEARCH_API_KEY` | Highest-priority explicit search key |\n| `ANTHROPIC_SEARCH_BASE_URL` | Defaults to `https://api.anthropic.com` when omitted |\n| `ANTHROPIC_SEARCH_MODEL` | Defaults to `claude-haiku-4-5` |\n| `ANTHROPIC_BASE_URL` | Generic fallback base URL for tier-4 auth path |\n\n### Perplexity OAuth flow behavior flag\n\n| Variable | Behavior |\n| ------------------- | ------------------------------------------------------------------------------- |\n| `PI_AUTH_NO_BORROW` | If set, disables macOS native-app token borrowing path in Perplexity login flow |\n\n---\n\n## 4) Python tooling and kernel runtime\n\n| Variable | Default / behavior |\n| ------------------------- | ------------------------------------------------------------------------------------------------------------------- |\n| `PI_PY` | Eval backend override: `0`/`bash`=JavaScript only, `1`/`py`=Python only, `mix`/`both`=both; invalid values ignored |\n| `PI_PYTHON_SKIP_CHECK` | If `1`, skips Python interpreter availability checks (subprocess runner still starts on demand) |\n| `PI_PYTHON_INTEGRATION` | If `1`, opts gated integration tests in (e.g. `python-runner.integration.test.ts`) into running against real Python |\n| `PI_PYTHON_IPC_TRACE` | If `1`, logs NDJSON frames exchanged with the Python runner subprocess |\n| `VIRTUAL_ENV` | Highest-priority venv path for Python runtime resolution |\n\nExtra conditional behavior:\n\n- If `BUN_ENV=test` or `NODE_ENV=test`, Python availability checks are treated as OK and warming is skipped.\n- Python env filtering denies common API keys and allows safe base vars + `LC_`, `XDG_`, `PI_` prefixes.\n\n---\n\n## 5) Agent/runtime behavior toggles\n\n| Variable | Default / behavior |\n| ---------------------------- | -------------------------------------------------------------------------------------------------- |\n| `PI_SMOL_MODEL` | Ephemeral model-role override for `smol` (CLI `--smol` takes precedence) |\n| `PI_SLOW_MODEL` | Ephemeral model-role override for `slow` (CLI `--slow` takes precedence) |\n| `PI_PLAN_MODEL` | Ephemeral model-role override for `plan` (CLI `--plan` takes precedence) |\n| `PI_NO_TITLE` | If set (any non-empty value), disables auto session title generation on first user message |\n| `NULL_PROMPT` | If `true`, system prompt builder returns empty string |\n| `PI_BLOCKED_AGENT` | Blocks a specific subagent type in task tool |\n| `PI_SUBPROCESS_CMD` | Overrides subagent spawn command (`omp` / `omp.cmd` resolution bypass) |\n| `PI_TASK_MAX_OUTPUT_BYTES` | Max captured output bytes per subagent (default `500000`) |\n| `PI_TASK_MAX_OUTPUT_LINES` | Max captured output lines per subagent (default `5000`) |\n| `PI_TIMING` | If set (any non-empty value), prints a hierarchical timing-span tree to **stderr** via `logger.printTimings()`. In interactive mode the tree prints once the agent is ready (before the TUI starts); in print mode it prints after the whole prompt batch completes. Print-mode prompts are wrapped in `print:prompt:initial` / `print:prompt:next` spans so each user message shows up as its own row. `PI_TIMING=x` exits the process with code 0 right after printing in interactive mode (use to measure cold startup only). `PI_TIMING=full` lists every module-load entry instead of just the top N. |\n| `PI_PACKAGE_DIR` | Overrides package asset base dir resolution (docs/examples/changelog path lookup) |\n| `PI_DISABLE_LSPMUX` | If `1`, disables lspmux detection/integration and forces direct LSP server spawning |\n| `PI_RPC_EMIT_TITLE` | Boolean-like flag enabling title events in RPC mode |\n| `SMITHERY_URL` | Smithery web URL override (default `https://smithery.ai`) |\n| `SMITHERY_API_URL` | Smithery API base URL override (default `https://api.smithery.ai`) |\n| `PUPPETEER_EXECUTABLE_PATH` | Browser tool Chromium executable override |\n| `LM_STUDIO_BASE_URL` | Default implicit LM Studio discovery base URL override (`http://127.0.0.1:1234/v1` if unset) |\n| `OLLAMA_BASE_URL` | Default implicit Ollama discovery base URL override (`http://127.0.0.1:11434` if unset) |\n| `LLAMA_CPP_BASE_URL` | Default implicit Llama.cpp discovery base URL override (`http://127.0.0.1:8080` if unset) |\n| `PI_EDIT_VARIANT` | Forces edit tool variant when valid (`patch`, `replace`, `hashline`, `atom`, `vim`, `apply_patch`) |\n| `PI_FORCE_IMAGE_PROTOCOL` | Forces supported image protocol (`kitty`, `iterm2`/`iterm`, `sixel`, `none`) where used |\n| `PI_ALLOW_SIXEL_PASSTHROUGH` | Allows SIXEL passthrough when `PI_FORCE_IMAGE_PROTOCOL=sixel` |\n| `PI_NO_PTY` | If `1`, disables interactive PTY path for bash tool |\n\n`PI_NO_PTY` is also set internally when CLI `--no-pty` is used.\n\n---\n\n## 6) Storage and config root paths\n\nThese are consumed via `@oh-my-pi/pi-utils/dirs` and affect where coding-agent stores data.\n\n| Variable | Default / behavior |\n| --------------------- | ----------------------------------------------------------------------------- |\n| `PI_CONFIG_DIR` | Config root dirname under home (default `.omp`) |\n| `PI_CODING_AGENT_DIR` | Full override for agent directory (default `~//agent`) |\n| `PWD` | Used when matching canonical current working directory in path helpers |\n\n---\n\n## 7) Shell/tool execution environment\n\n(From `packages/utils/src/procmgr.ts` and coding-agent bash tool integration.)\n\n| Variable | Behavior |\n| -------------------------- | ------------------------------------------------------------------------------ |\n| `PI_BASH_NO_CI` | Suppresses automatic `CI=true` injection into spawned shell env |\n| `CLAUDE_BASH_NO_CI` | Legacy alias fallback for `PI_BASH_NO_CI` |\n| `PI_BASH_NO_LOGIN` | Disables login-shell mode; shell args become `['-c']` instead of `['-l','-c']` |\n| `CLAUDE_BASH_NO_LOGIN` | Legacy alias fallback for `PI_BASH_NO_LOGIN` |\n| `PI_SHELL_PREFIX` | Optional command prefix wrapper |\n| `CLAUDE_CODE_SHELL_PREFIX` | Legacy alias fallback for `PI_SHELL_PREFIX` |\n| `VISUAL` | Preferred external editor command |\n| `EDITOR` | Fallback external editor command |\n\nCurrent implementation: `PI_BASH_NO_LOGIN`/`CLAUDE_BASH_NO_LOGIN` are active; when either is set, `getShellArgs()` returns `['-c']`.\n\n---\n\n## 8) UI/theme/session detection (auto-detected env)\n\nThese are read as runtime signals; they are usually set by the terminal/OS rather than manually configured.\n\n| Variable | Used for |\n| ------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------- |\n| `COLORTERM`, `TERM`, `WT_SESSION` | Color capability detection (theme color mode) |\n| `COLORFGBG` | Terminal background light/dark auto-detection |\n| `TERM_PROGRAM`, `TERM_PROGRAM_VERSION`, `TERMINAL_EMULATOR` | Terminal identity in system prompt/context |\n| `KDE_FULL_SESSION`, `XDG_CURRENT_DESKTOP`, `DESKTOP_SESSION`, `XDG_SESSION_DESKTOP`, `GDMSESSION`, `WINDOWMANAGER` | Desktop/window-manager detection in system prompt/context |\n| `KITTY_WINDOW_ID`, `TMUX_PANE`, `TERM_SESSION_ID`, `WT_SESSION` | Stable per-terminal session breadcrumb IDs |\n| `SHELL`, `ComSpec`, `TERM_PROGRAM`, `TERM` | System info diagnostics |\n| `APPDATA`, `XDG_CONFIG_HOME` | lspmux config path resolution |\n| `HOME` | Path shortening in MCP command UI |\n\n---\n\n## 9) TUI runtime flags (shared package, affects coding-agent UX)\n\n| Variable | Behavior |\n| ------------------------- | ------------------------------------------------------------------------------------- |\n| `PI_NOTIFICATIONS` | `off` / `0` / `false` suppress desktop notifications |\n| `PI_TUI_WRITE_LOG` | If set, logs TUI writes to file |\n| `PI_HARDWARE_CURSOR` | If `1`, enables hardware cursor mode |\n| `PI_CLEAR_ON_SHRINK` | If `1`, clears empty rows when content shrinks |\n| `PI_DEBUG_REDRAW` | If `1`, enables redraw debug logging |\n| `PI_TUI_DEBUG` | If `1`, enables deep TUI debug dump path |\n| `PI_FORCE_IMAGE_PROTOCOL` | Forces terminal image protocol detection (`kitty`, `iterm2`/`iterm`, `sixel`, `none`) |\n\n---\n\n## 10) Commit generation controls\n\n| Variable | Behavior |\n| ------------------------- | ------------------------------------------------------------------- |\n| `PI_COMMIT_TEST_FALLBACK` | If `true` (case-insensitive), force commit fallback generation path |\n| `PI_COMMIT_NO_FALLBACK` | If `true`, disables fallback when agent returns no proposal |\n| `PI_COMMIT_MAP_REDUCE` | If `false`, disables map-reduce commit analysis path |\n| `DEBUG` | If set, commit agent error stack traces are printed |\n\n---\n\n## Security-sensitive variables\n\nTreat these as secrets; do not log or commit them:\n\n- Provider/API keys and OAuth/bearer credentials (all `*_API_KEY`, `*_TOKEN`, OAuth access/refresh tokens)\n- Cloud credentials (`AWS_*`, `GOOGLE_APPLICATION_CREDENTIALS` path may expose service-account material)\n- Search/provider auth vars (`EXA_API_KEY`, `BRAVE_API_KEY`, `PERPLEXITY_API_KEY`, Anthropic search keys)\n- Foundry mTLS material (`CLAUDE_CODE_CLIENT_CERT`, `CLAUDE_CODE_CLIENT_KEY`, `NODE_EXTRA_CA_CERTS` when it points to private CA bundles)\n\nPython runtime also explicitly strips many common key vars before spawning kernel subprocesses (`packages/coding-agent/src/eval/py/runtime.ts`).\n", "extension-loading.md": "# Extension Loading (TypeScript/JavaScript Modules)\n\nThis document covers how the coding agent discovers and loads **extension modules** (`.ts`/`.js`) at startup.\n\nIt does **not** cover `gemini-extension.json` manifest extensions (documented separately).\n\n## What this subsystem does\n\nExtension loading builds a list of module entry files, imports each module with Bun, executes its factory, and returns:\n\n- loaded extension definitions\n- per-path load errors (without aborting the whole load)\n- a shared extension runtime object used later by `ExtensionRunner`\n\n## Primary implementation files\n\n- `src/extensibility/extensions/loader.ts` — path discovery + import/execution\n- `src/extensibility/extensions/index.ts` — public exports\n- `src/extensibility/extensions/runner.ts` — runtime/event execution after load\n- `src/discovery/builtin.ts` — native auto-discovery provider for extension modules\n- `src/config/settings.ts` — loads merged `extensions` / `disabledExtensions` settings\n\n---\n\n## Inputs to extension loading\n\n### 1) Auto-discovered native extension modules\n\n`discoverAndLoadExtensions()` first asks discovery providers for `extension-module` capability items, then keeps only provider `native` items.\n\nEffective native locations:\n\n- Project: `/.omp/extensions`\n- User: `~/.omp/agent/extensions`\n\nPath roots come from the native provider (`SOURCE_PATHS.native`).\n\nNotes:\n\n- Native auto-discovery is currently `.omp` based.\n- Legacy `.pi` is still accepted in `package.json` manifest keys (`pi.extensions`), but not as a native root here.\n\n### 2) Installed plugin extension entries\n\nAfter native auto-discovery, `discoverAndLoadExtensions()` appends extension entry points from enabled installed plugins via `getAllPluginExtensionPaths(cwd)`.\n\nPlugin extension entries come from package `omp.extensions` / `pi.extensions` manifests, including enabled feature entries.\n\n### 3) Explicitly configured paths\n\nAfter plugin extension entries, configured paths are appended and resolved.\n\nConfigured path sources in the main session startup path (`sdk.ts`):\n\n1. CLI-provided paths (`--extension/-e`, and `--hook` is also treated as an extension path)\n2. Settings `extensions` array (merged global + project settings)\n\nGlobal settings file:\n\n- `~/.omp/agent/config.yml` (or custom agent dir via `PI_CODING_AGENT_DIR`)\n\nProject settings file:\n\n- `/.omp/settings.json`\n\nExamples:\n\n```yaml\n# ~/.omp/agent/config.yml\nextensions:\n - ~/my-exts/safety.ts\n - ./local/ext-pack\n```\n\n```json\n{\n \"extensions\": [\"./.omp/extensions/my-extra\"]\n}\n```\n\n---\n\n## Enable/disable controls\n\n### Disable discovery\n\n- CLI: `--no-extensions`\n- SDK option: `disableExtensionDiscovery`\n\nBehavior split:\n\n- SDK: when `disableExtensionDiscovery=true`, it still loads `additionalExtensionPaths` via `loadExtensions()`.\n- CLI path building (`main.ts`) currently clears CLI extension paths when `--no-extensions` is set, so explicit `-e/--hook` are not forwarded in that mode.\n\n### Disable specific extension modules\n\n`disabledExtensions` setting filters by extension id format:\n\n- `extension-module:`\n\n`derivedName` is based on entry path (`getExtensionNameFromPath`), for example:\n\n- `/x/foo.ts` -> `foo`\n- `/x/bar/index.ts` -> `bar`\n\nExample:\n\n```yaml\ndisabledExtensions:\n - extension-module:foo\n```\n\n---\n\n## Path and entry resolution\n\n### Path normalization\n\nFor configured paths:\n\n1. Normalize unicode spaces\n2. Expand `~`\n3. If relative, resolve against current `cwd`\n\n### If configured path is a file\n\nIt is used directly as a module entry candidate.\n\n### If configured path is a directory\n\nResolution order:\n\n1. `package.json` in that directory with `omp.extensions` (or legacy `pi.extensions`) -> use declared entries\n2. `index.ts`\n3. `index.js`\n4. Otherwise scan one level for extension entries:\n - direct `*.ts` / `*.js`\n - subdir `index.ts` / `index.js`\n - subdir `package.json` with `omp.extensions` / `pi.extensions`\n\nRules and constraints:\n\n- no recursive discovery beyond one subdirectory level\n- declared `extensions` manifest entries are resolved relative to that package directory\n- declared entries are included only if file exists/access is allowed\n- in `*/index.{ts,js}` pairs, TypeScript is preferred over JavaScript\n- symlinks are treated as eligible files/directories\n\n### Ignore behavior differs by source\n\n- Native auto-discovery (`discoverExtensionModulePaths` in discovery helpers) uses native glob with `gitignore: true` and `hidden: false`.\n- Explicit configured directory scanning in `loader.ts` uses `readdir` rules and does **not** apply gitignore filtering.\n\n---\n\n## Load order and precedence\n\n`discoverAndLoadExtensions()` builds one ordered list and then calls `loadExtensions()`.\n\nOrder:\n\n1. Native auto-discovered modules\n2. Installed plugin extension entries\n3. Explicit configured paths (in provided order)\n\nIn `sdk.ts`, configured order is:\n\n1. CLI additional paths\n2. Settings `extensions`\n\nDe-duplication:\n\n- absolute path based\n- first seen path wins\n- later duplicates are ignored\n\nImplication: if the same module path is both auto-discovered and explicitly configured, it is loaded once at the first position (auto-discovered stage).\n\n---\n\n## Module import and factory contract\n\nEach candidate path is loaded with dynamic import:\n\n- `await import(resolvedPath)`\n- factory is `module.default ?? module`\n- factory must be a function (`ExtensionFactory`)\n\nIf export is not a function, that path fails with a structured error and loading continues.\n\n---\n\n## Failure handling and isolation\n\n### During loading\n\nPer extension path, failures are captured as `{ path, error }` and do not stop other paths from loading.\n\nCommon cases:\n\n- import failure / missing file\n- invalid factory export (non-function)\n- exception thrown while executing factory\n\n### Runtime isolation model\n\n- Extensions are **not sandboxed** (same process/runtime).\n- They share one `EventBus` and one `ExtensionRuntime` instance.\n- During load, runtime action methods intentionally throw `ExtensionRuntimeNotInitializedError`; action wiring happens later in `ExtensionRunner.initialize()`.\n\n### After loading\n\nWhen events run through `ExtensionRunner`, handler exceptions are caught and emitted as extension errors instead of crashing the runner loop.\n\n---\n\n## Minimal user/project layout examples\n\n### User-level\n\n```text\n~/.omp/agent/\n config.yml\n extensions/\n guardrails.ts\n audit/\n index.ts\n```\n\n### Project-level\n\n```text\n/\n .omp/\n settings.json\n extensions/\n checks/\n package.json\n lint-gates.ts\n```\n\n`checks/package.json`:\n\n```json\n{\n \"omp\": {\n \"extensions\": [\"./src/check-a.ts\", \"./src/check-b.js\"]\n }\n}\n```\n\nLegacy manifest key still accepted:\n\n```json\n{\n \"pi\": {\n \"extensions\": [\"./index.ts\"]\n }\n}\n```\n", "extensions.md": "# Extensions\n\nPrimary guide for authoring runtime extensions in `packages/coding-agent`.\n\nThis document covers the current extension runtime in:\n\n- `src/extensibility/extensions/types.ts`\n- `src/extensibility/extensions/runner.ts`\n- `src/extensibility/extensions/wrapper.ts`\n- `src/extensibility/extensions/index.ts`\n- `src/modes/controllers/extension-ui-controller.ts`\n\nFor discovery paths and filesystem loading rules, see `docs/extension-loading.md`.\n\n## What an extension is\n\nAn extension is a TS/JS module exporting a default factory:\n\n```ts\nimport type { ExtensionAPI } from \"@oh-my-pi/pi-coding-agent\";\n\nexport default function myExtension(pi: ExtensionAPI) {\n // register handlers/tools/commands/renderers\n}\n```\n\nExtensions can combine all of the following in one module:\n\n- event handlers (`pi.on(...)`)\n- LLM-callable tools (`pi.registerTool(...)`)\n- slash commands (`pi.registerCommand(...)`)\n- keyboard shortcuts and flags\n- custom message rendering\n- session/message injection APIs (`sendMessage`, `sendUserMessage`, `appendEntry`)\n\n## Runtime model\n\n1. Extensions are imported and their factory functions run.\n2. During that load phase, registration methods are valid; runtime action methods are not yet initialized.\n3. `ExtensionRunner.initialize(...)` wires live actions/contexts for the active mode.\n4. Session/agent/tool lifecycle events are emitted to handlers.\n5. Every tool execution is wrapped with extension interception (`tool_call` / `tool_result`).\n\n```text\nExtension lifecycle (simplified)\n\nload paths\n │\n ▼\nimport module + run factory (registration only)\n │\n ▼\nExtensionRunner.initialize(mode/session/tool registry)\n │\n ├─ emit session/agent events to handlers\n ├─ wrap tool execution (tool_call/tool_result)\n └─ expose runtime actions (sendMessage, setActiveTools, ...)\n```\n\nImportant constraint from `loader.ts`:\n\n- calling action methods like `pi.sendMessage()` during extension load throws `ExtensionRuntimeNotInitializedError`\n- register first; perform runtime behavior from events/commands/tools\n\n## Quick start\n\n```ts\nimport type { ExtensionAPI } from \"@oh-my-pi/pi-coding-agent\";\n\nexport default function (pi: ExtensionAPI) {\n const { z } = pi.zod;\n\n pi.setLabel(\"Safety + Utilities\");\n\n pi.on(\"session_start\", async (_event, ctx) => {\n ctx.ui.notify(`Extension loaded in ${ctx.cwd}`, \"info\");\n });\n\n pi.on(\"tool_call\", async (event) => {\n if (event.toolName === \"bash\" && event.input.command?.includes(\"rm -rf\")) {\n return { block: true, reason: \"Blocked by extension policy\" };\n }\n });\n\n pi.registerTool({\n name: \"hello_extension\",\n label: \"Hello Extension\",\n description: \"Return a greeting\",\n parameters: z.object({ name: z.string() }),\n async execute(_toolCallId, params, _signal, _onUpdate, _ctx) {\n return {\n content: [{ type: \"text\", text: `Hello, ${params.name}` }],\n details: { greeted: params.name },\n };\n },\n });\n\n pi.registerCommand(\"hello-ext\", {\n description: \"Show queue state\",\n handler: async (_args, ctx) => {\n ctx.ui.notify(`pending=${ctx.hasPendingMessages()}`, \"info\");\n },\n });\n}\n```\n\n## Extension API surfaces\n\n## 1) Registration and actions (`ExtensionAPI`)\n\nCore methods:\n\n- `on(event, handler)`\n- `registerTool`, `registerCommand`, `registerShortcut`, `registerFlag`\n- `registerMessageRenderer`\n- `sendMessage`, `sendUserMessage`, `appendEntry`\n- `getActiveTools`, `getAllTools`, `setActiveTools`\n- `getSessionName`, `setSessionName`\n- `setModel`, `getThinkingLevel`, `setThinkingLevel`\n- `registerProvider`\n- `events` (shared event bus)\n\nIn interactive mode, `input` handlers run before the built-in first-message auto-title check. Extensions that call `await pi.setSessionName(...)` from `input` can set the persisted session name and prevent the default auto-generated title from running for that session.\n\nAlso exposed:\n\n- `pi.logger`\n- `pi.zod` (injected `zod` module — use for tool parameter schemas)\n- `pi.pi` (package exports)\n\n### Message delivery semantics\n\n`pi.sendMessage(message, options)` supports:\n\n- `deliverAs: \"steer\"` (default) — interrupts current run\n- `deliverAs: \"followUp\"` — queued to run after current run\n- `deliverAs: \"nextTurn\"` — stored and injected on the next user prompt\n- `triggerTurn: true` — starts a turn when idle (`nextTurn` ignores this)\n\n`pi.sendUserMessage(content, { deliverAs })` always goes through prompt flow; while streaming it queues as steer/follow-up.\n\n## 2) Handler context (`ExtensionContext`)\n\nHandlers and tool `execute` receive `ctx` with:\n\n- `ui`\n- `hasUI`\n- `cwd`\n- `sessionManager` (read-only)\n- `modelRegistry`, `model`\n- `getContextUsage()`\n- `compact(...)`\n- `isIdle()`, `hasPendingMessages()`, `abort()`\n- `shutdown()`\n- `getSystemPrompt()`\n\n## 3) Command context (`ExtensionCommandContext`)\n\nCommand handlers additionally get:\n\n- `waitForIdle()`\n- `newSession(...)`\n- `switchSession(...)`\n- `branch(entryId)`\n- `navigateTree(targetId, { summarize })`\n- `reload()`\n\nUse command context for session-control flows; these methods are intentionally separated from general event handlers.\n\n## Event surface (current names and behavior)\n\nCanonical event unions and payload types are in `types.ts`.\n\n### Session lifecycle\n\n- `session_start`\n- `session_before_switch` / `session_switch`\n- `session_before_branch` / `session_branch`\n- `session_before_compact` / `session.compacting` / `session_compact`\n- `session_before_tree` / `session_tree`\n- `session_shutdown`\n\nCancelable pre-events:\n\n- `session_before_switch` → `{ cancel?: boolean }`\n- `session_before_branch` → `{ cancel?: boolean; skipConversationRestore?: boolean }`\n- `session_before_compact` → `{ cancel?: boolean; compaction?: CompactionResult }`\n- `session_before_tree` → `{ cancel?: boolean; summary?: { summary: string; details?: unknown } }`\n\n### Prompt and turn lifecycle\n\n- `input`\n- `before_agent_start`\n- `context`\n- `agent_start` / `agent_end`\n- `turn_start` / `turn_end`\n- `message_start` / `message_update` / `message_end`\n\n### Tool lifecycle\n\n- `tool_call` (pre-exec, may block)\n- `tool_result` (post-exec, may patch content/details/isError)\n- `tool_execution_start` / `tool_execution_update` / `tool_execution_end` (observability)\n\n`tool_result` is middleware-style: handlers run in extension order and each sees prior modifications.\n\n### Reliability/runtime signals\n\n- `auto_compaction_start` / `auto_compaction_end`\n- `auto_retry_start` / `auto_retry_end`\n- `ttsr_triggered`\n- `todo_reminder`\n\n### User command interception\n\n- `user_bash` (override with `{ result }`)\n- `user_python` (override with `{ result }`)\n\n### `resources_discover`\n\n`resources_discover` exists in extension types and `ExtensionRunner`.\nCurrent runtime note: `ExtensionRunner.emitResourcesDiscover(...)` is implemented, but there are no `AgentSession` callsites invoking it in the current codebase.\n\n## Tool authoring details\n\n`registerTool` uses `ToolDefinition` from `types.ts`.\n\nCurrent `execute` signature:\n\n```ts\nexecute(\n\ttoolCallId,\n\tparams,\n\tsignal,\n\tonUpdate,\n\tctx,\n): Promise\n```\n\nTemplate:\n\n```ts\nconst { z } = pi.zod;\n\npi.registerTool({\n name: \"my_tool\",\n label: \"My Tool\",\n description: \"...\",\n parameters: z.object({}),\n async execute(_id, _params, signal, onUpdate, ctx) {\n if (signal?.aborted) {\n return { content: [{ type: \"text\", text: \"Cancelled\" }] };\n }\n onUpdate?.({ content: [{ type: \"text\", text: \"Working...\" }] });\n return { content: [{ type: \"text\", text: \"Done\" }], details: {} };\n },\n onSession(event, ctx) {\n // reason: start|switch|branch|tree|shutdown\n },\n renderCall(args, options, theme) {\n // optional TUI render\n },\n renderResult(result, options, theme, args) {\n // optional TUI render\n },\n});\n```\n\n`tool_call`/`tool_result` intercept all tools once the registry is wrapped in `sdk.ts`, including built-ins and extension/custom tools.\n\n## UI integration points\n\n`ctx.ui` implements the `ExtensionUIContext` interface. Support differs by mode.\n\n### Interactive mode (`extension-ui-controller.ts`)\n\nSupported:\n\n- dialogs: `select`, `confirm`, `input`, `editor`\n- notifications/status/editor text/terminal input/custom overlays\n- theme listing/loading by name (`setTheme` supports string names)\n- tools expanded toggle\n\nCurrent no-op methods in this controller:\n\n- `setFooter`\n- `setHeader`\n- `setEditorComponent`\n\nAlso note: `setWidget` currently routes to status-line text via `setHookWidget(...)`.\n\n### RPC mode (`rpc-mode.ts`)\n\n`ctx.ui` is backed by RPC `extension_ui_request` events:\n\n- dialog methods (`select`, `confirm`, `input`, `editor`) round-trip to client responses\n- fire-and-forget methods emit requests (`notify`, `setStatus`, `setWidget` for string arrays, `setTitle`, `setEditorText`)\n\nUnsupported/no-op in RPC implementation:\n\n- `onTerminalInput`\n- `custom`\n- `setFooter`, `setHeader`, `setEditorComponent`\n- `setWorkingMessage`\n- theme switching/loading (`setTheme` returns failure)\n- tool expansion controls are inert\n\n### Print/headless/subagent paths\n\nWhen no UI context is supplied to runner init, `ctx.hasUI` is `false` and methods are no-op/default-returning.\n\n### Background interactive mode\n\nBackground mode installs a non-interactive UI context object. In current implementation, `ctx.hasUI` may still be `true` while interactive dialogs return defaults/no-op behavior.\n\n## Session and state patterns\n\nFor durable extension state:\n\n1. Persist with `pi.appendEntry(customType, data)`.\n2. Rebuild state from `ctx.sessionManager.getBranch()` on `session_start`, `session_branch`, `session_tree`.\n3. Keep tool result `details` structured when state should be visible/reconstructible from tool result history.\n\nExample reconstruction pattern:\n\n```ts\npi.on(\"session_start\", async (_event, ctx) => {\n let latest;\n for (const entry of ctx.sessionManager.getBranch()) {\n if (entry.type === \"custom\" && entry.customType === \"my-state\") {\n latest = entry.data;\n }\n }\n // restore from latest\n});\n```\n\n## Rendering extension points\n\n## Custom message renderer\n\n```ts\npi.registerMessageRenderer(\"my-type\", (message, { expanded }, theme) => {\n // return pi-tui Component\n});\n```\n\nUsed by interactive rendering when custom messages are displayed.\n\n## Tool call/result renderer\n\nProvide `renderCall` / `renderResult` on `registerTool` definitions for custom tool visualization in TUI.\n\n## Constraints and pitfalls\n\n- Runtime actions are unavailable during extension load.\n- `tool_call` errors block execution (fail-closed).\n- Command name conflicts with built-ins are skipped with diagnostics.\n- Reserved shortcuts are ignored (`ctrl+c`, `ctrl+d`, `ctrl+z`, `ctrl+k`, `ctrl+p`, `ctrl+l`, `ctrl+o`, `ctrl+t`, `ctrl+g`, `shift+tab`, `shift+ctrl+p`, `alt+enter`, `escape`, `enter`).\n- Treat `ctx.reload()` as terminal for the current command handler frame.\n\n## Extensions vs hooks vs custom-tools\n\nUse the right surface:\n\n- **Extensions** (`src/extensibility/extensions/*`): unified system (events + tools + commands + renderers + provider registration).\n- **Hooks** (`src/extensibility/hooks/*`): separate legacy event API.\n- **Custom-tools** (`src/extensibility/custom-tools/*`): tool-focused modules; when loaded alongside extensions they are adapted and still pass through extension interception wrappers.\n\nIf you need one package that owns policy, tools, command UX, and rendering together, use extensions.\n", "fs-scan-cache-architecture.md": "# Filesystem Scan Cache Architecture Contract\n\nThis document defines the current contract for the shared filesystem scan cache implemented in Rust (`crates/pi-natives/src/fs_cache.rs`) and consumed by native discovery/search APIs exposed to `packages/coding-agent`.\n\n## What this cache is\n\nThe cache stores full directory-scan entry lists (`GlobMatch[]`) keyed by scan scope and traversal policy, then lets higher-level operations (glob filtering, fuzzy scoring, grep file selection) run against those cached entries.\n\nPrimary goals:\n\n- avoid repeated filesystem walks for repeated discovery/search calls\n- keep consistency across `glob`, `fuzzyFind`, and `grep` when they share the same scan policy\n- allow explicit staleness recovery for empty results and explicit invalidation after file mutations\n\n## Ownership and public surface\n\n- Cache implementation and policy: `crates/pi-natives/src/fs_cache.rs`\n- Native consumers:\n - `crates/pi-natives/src/glob.rs`\n - `crates/pi-natives/src/fd.rs` (`fuzzyFind`)\n - `crates/pi-natives/src/grep.rs`\n- JS binding/export:\n - `packages/natives/src/glob/index.ts` (`invalidateFsScanCache`)\n - `packages/natives/src/glob/types.ts`\n - `packages/natives/src/grep/types.ts`\n- Coding-agent mutation invalidation helpers:\n - `packages/coding-agent/src/tools/fs-cache-invalidation.ts`\n\n## Cache key partitioning (hard contract)\n\nEach entry is keyed by:\n\n- canonicalized `root` directory path\n- `include_hidden` boolean\n- `use_gitignore` boolean\n- `skip_node_modules` boolean\n\nImplications:\n\n- Hidden and non-hidden scans do **not** share entries.\n- Gitignore-respecting and ignore-disabled scans do **not** share entries.\n- Scans that prune `node_modules` do **not** share entries with scans that include it.\n- Consumers must pass stable semantics for hidden/gitignore/node_modules behavior; changing any flag creates a different cache partition.\n\n## Scan collection behavior\n\nCache population uses a deterministic walker (`ignore::WalkBuilder`) configured by `include_hidden`, `use_gitignore`, and `skip_node_modules`:\n\n- `follow_links(false)`\n- sorted by file path\n- `.git` is always skipped\n- `node_modules` is pruned at traversal time when `skip_node_modules=true`\n- entry file type + `mtime` are captured via `symlink_metadata`\n\nSearch roots are resolved by `resolve_search_path`:\n\n- relative paths are resolved against current cwd\n- target must be an existing directory\n- root is canonicalized when possible\n\n## Freshness and eviction policy\n\nGlobal policy (environment-overridable):\n\n- `FS_SCAN_CACHE_TTL_MS` (default `1000`)\n- `FS_SCAN_EMPTY_RECHECK_MS` (default `200`)\n- `FS_SCAN_CACHE_MAX_ENTRIES` (default `16`)\n\nBehavior:\n\n- `get_or_scan(...)`\n - if TTL is `0`: bypass cache entirely, always fresh scan (`cache_age_ms = 0`)\n - on cache hit within TTL: return cached entries + non-zero `cache_age_ms`\n - on expired hit: evict key, rescan, store fresh entry\n- max entry enforcement is oldest-first eviction by `created_at`\n\n## Empty-result fast recheck (separate from normal hits)\n\nNormal cache hit:\n\n- a cache hit inside TTL returns cached entries and does nothing else.\n\nEmpty-result fast recheck:\n\n- this is a **caller-side** policy using `ScanResult.cache_age_ms`\n- if filtered/query result is empty and cached scan age is at least `empty_recheck_ms()`, caller performs one `force_rescan(...)` and retries\n- intended to reduce stale-negative results when files were recently added but cache is still within TTL\n\nCurrent consumers:\n\n- `glob`: rechecks when filtered matches are empty and scan age exceeds threshold\n- `fuzzyFind` (`fd.rs`): rechecks only when query is non-empty and scored matches are empty\n- `grep`: rechecks when selected candidate file list is empty\n\n## Consumer defaults and cache usage\n\nCache is opt-in on all exposed APIs (`cache?: boolean`, default `false`).\n\nCurrent defaults in native APIs:\n\n- `glob`: `hidden=false`, `gitignore=true`, `cache=false`, and `node_modules` included only when the pattern mentions `node_modules`\n- `fuzzyFind`: `hidden=false`, `gitignore=true`, `cache=false`, and `node_modules` is skipped\n- `grep`: `hidden=true`, `gitignore=true`, `cache=false`, and `node_modules` included only when the glob mentions `node_modules`\n\nCoding-agent callers today:\n\n- High-volume mention candidate discovery enables cache:\n - `packages/coding-agent/src/utils/file-mentions.ts`\n - profile: `hidden=true`, `gitignore=true`, `includeNodeModules=true`, `cache=true`\n- Tool-level `grep` integration currently disables scan cache (`cache: false`):\n - `packages/coding-agent/src/tools/grep.ts`\n\n## Invalidation contract\n\nNative invalidation entrypoint:\n\n- `invalidateFsScanCache(path?: string)`\n - with `path`: remove cache entries whose root is a prefix of target path\n - without path: clear all scan cache entries\n\nPath handling details:\n\n- relative invalidation paths are resolved against cwd\n- invalidation attempts canonicalization\n- if target does not exist (e.g., delete), fallback canonicalizes parent and reattaches filename when possible\n- this preserves invalidation behavior for create/delete/rename where one side may not exist\n\n## Coding-agent mutation flow responsibilities\n\nCoding-agent code must invalidate after successful filesystem mutations.\n\nCentral helpers:\n\n- `invalidateFsScanAfterWrite(path)`\n- `invalidateFsScanAfterDelete(path)`\n- `invalidateFsScanAfterRename(oldPath, newPath)` (invalidates both sides when paths differ)\n\nCurrent mutation tool callsites:\n\n- `packages/coding-agent/src/tools/write.ts`\n- `packages/coding-agent/src/patch/index.ts` (hashline/patch/replace flows)\n\nRule: if a flow mutates filesystem content or location and bypasses these helpers, cache staleness bugs are expected.\n\n## Adding a new cache consumer safely\n\nWhen introducing cache use in a new scanner/search path:\n\n1. **Use stable scan policy inputs**\n - decide hidden/gitignore/node_modules semantics first\n - pass them consistently to `get_or_scan`/`force_rescan` so cache partitions are intentional\n\n2. **Treat cache data as pre-filtered only by traversal policy**\n - apply tool-specific filtering (glob patterns, type filters, scoring) after retrieval\n - never assume cached entries already reflect your higher-level filters\n\n3. **Implement empty-result fast recheck only for stale-negative risk**\n - use `scan.cache_age_ms >= empty_recheck_ms()`\n - retry once with `force_rescan(..., store=true, ...)`\n - keep this path separate from normal cache-hit logic\n\n4. **Respect no-cache mode explicitly**\n - when caller disables cache, call `force_rescan(..., store=false, ...)`\n - do not populate shared cache in a no-cache request path\n\n5. **Wire mutation invalidation for any new write path**\n - after successful write/edit/delete/rename, call the coding-agent invalidation helper\n - for rename/move, invalidate both old and new paths\n\n6. **Do not add per-call TTL knobs**\n - current contract is global policy only (env-configured), no per-request TTL override\n\n## Known boundaries\n\n- Cache scope is process-local in-memory (`DashMap`), not persisted across process restarts.\n- Cache stores scan entries, not final tool results.\n- `glob`/`fuzzyFind`/`grep` share scan entries only when key dimensions (`root`, `hidden`, `gitignore`, `skip_node_modules`) match.\n- `.git` is always excluded at scan collection time regardless of caller options.\n", "gemini-manifest-extensions.md": "# Gemini Manifest Extensions (`gemini-extension.json`)\n\nThis document covers how the coding-agent discovers and parses Gemini-style manifest extensions (`gemini-extension.json`) into the `extensions` capability.\n\nIt does **not** cover TypeScript/JavaScript extension module loading (`extensions/*.ts`, `index.ts`, `package.json omp.extensions`), which is documented in `extension-loading.md`.\n\n## Implementation files\n\n- [`../src/discovery/gemini.ts`](../packages/coding-agent/src/discovery/gemini.ts)\n- [`../src/discovery/builtin.ts`](../packages/coding-agent/src/discovery/builtin.ts)\n- [`../src/discovery/helpers.ts`](../packages/coding-agent/src/discovery/helpers.ts)\n- [`../src/capability/extension.ts`](../packages/coding-agent/src/capability/extension.ts)\n- [`../src/capability/index.ts`](../packages/coding-agent/src/capability/index.ts)\n- [`../src/extensibility/extensions/loader.ts`](../packages/coding-agent/src/extensibility/extensions/loader.ts)\n\n---\n\n## What gets discovered\n\nThe Gemini provider (`id: gemini`, priority `60`) registers an `extensions` loader that scans two fixed roots:\n\n- User: `~/.gemini/extensions`\n- Project: `/.gemini/extensions`\n\nPath resolution is direct from `ctx.home` and `ctx.cwd` via `getUserPath()` / `getProjectPath()`.\n\nImportant scope rule: project lookup is **cwd-only**. It does not walk parent directories.\n\n---\n\n## Directory scan rules\n\nFor each root (`~/.gemini/extensions` and `/.gemini/extensions`), discovery does:\n\n1. `readDirEntries(root)`\n2. keep only direct child directories (`entry.isDirectory()`)\n3. for each child ``, attempt to read exactly:\n - `//gemini-extension.json`\n\nThere is no recursive scan beyond one directory level.\n\n### Hidden directories\n\nGemini manifest discovery does **not** filter out dot-prefixed directory names. If a hidden child directory exists and contains `gemini-extension.json`, it is considered.\n\n### Missing/unreadable files\n\nIf `gemini-extension.json` is missing or unreadable, that directory is skipped silently (no warning).\n\n---\n\n## Manifest shape (as implemented)\n\nThe capability type defines this manifest shape:\n\n```ts\ninterface ExtensionManifest {\n name?: string;\n description?: string;\n mcpServers?: Record>;\n tools?: unknown[];\n context?: unknown;\n}\n```\n\nDiscovery-time behavior is intentionally loose:\n\n- JSON parse success is required.\n- There is no runtime schema validation for field types/content beyond JSON syntax.\n- The parsed object is stored as `manifest` on the capability item.\n\n### Name normalization\n\n`Extension.name` is set to:\n\n1. `manifest.name` if it is not `null`/`undefined`\n2. otherwise the extension directory name\n\nNo string-type enforcement is applied here.\n\n---\n\n## Materialization into capability items\n\nA valid parsed manifest creates one `Extension` capability item:\n\n```ts\n{\n\tname: manifest.name ?? ,\n\tpath: ,\n\tmanifest: ,\n\tlevel: \"user\" | \"project\",\n\t_source: {\n\t\tprovider: \"gemini\",\n\t\tproviderName: \"Gemini CLI\" // attached by capability registry\n\t\tpath: ,\n\t\tlevel: \"user\" | \"project\"\n\t}\n}\n```\n\nNotes:\n\n- `_source.path` is normalized to an absolute path by `createSourceMeta()`.\n- Registry-level capability validation for `extensions` only checks presence of `name` and `path`.\n- Manifest internals (`mcpServers`, `tools`, `context`) are not validated during discovery.\n\n---\n\n## Error handling and warning semantics\n\n### Warned\n\n- Invalid JSON in a manifest file:\n - warning format: `Invalid JSON in `\n\n### Not warned (silent skip)\n\n- `extensions` directory missing\n- child directory has no `gemini-extension.json`\n- unreadable manifest file\n- manifest JSON is syntactically valid but semantically odd/incomplete\n\nThis means partial validity is accepted: only syntactic JSON failure emits a warning.\n\n---\n\n## Precedence and deduplication with other sources\n\n`extensions` capability is aggregated across providers by the capability registry.\n\nCurrent providers for this capability:\n\n- `native` (`packages/coding-agent/src/discovery/builtin.ts`) priority `100`\n- `gemini` (`packages/coding-agent/src/discovery/gemini.ts`) priority `60`\n\nDedup key is `ext.name` (`extensionCapability.key = ext => ext.name`).\n\n### Cross-provider precedence\n\nHigher-priority provider wins on duplicate extension names.\n\n- If `native` and `gemini` both emit extension name `foo`, the native item is kept.\n- Lower-priority duplicate is retained only in `result.all` with `_shadowed = true`.\n\n### Intra-provider order effects\n\nBecause dedup is “first seen wins”, provider-local item order matters.\n\n- Gemini loader appends **user first**, then **project**.\n- Therefore, duplicate names between `~/.gemini/extensions` and `/.gemini/extensions` keep the user entry and shadow the project entry.\n\nBy contrast, native provider builds config dir order differently (`project` then `user` in `getConfigDirs()`), so native intra-provider shadowing is the opposite direction.\n\n---\n\n## User vs project behavior summary\n\nFor Gemini manifests specifically:\n\n- Both user and project roots are scanned every load.\n- Project root is fixed to `/.gemini/extensions` (no ancestor walk).\n- Duplicate names inside Gemini source resolve to user-first.\n- Duplicate names against higher-priority providers (notably native) lose by priority.\n\n---\n\n## Boundary: discovery metadata vs runtime extension loading\n\n`gemini-extension.json` discovery currently feeds capability metadata (`Extension` items). It does **not** directly load runnable TS/JS extension modules.\n\nRuntime module loading (`discoverAndLoadExtensions()` / `loadExtensions()`) uses `extension-modules` and explicit paths, and currently filters auto-discovered modules to provider `native` only.\n\nPractical implication:\n\n- Gemini manifest extensions are discoverable as capability records.\n- They are not, by themselves, executed as runtime extension modules by the extension loader pipeline.\n\nThis boundary is intentional in current implementation and explains why manifest discovery and executable module loading can diverge.\n", "handoff-generation-pipeline.md": "# `/handoff` generation pipeline\n\nThis document describes how the coding-agent implements `/handoff`: trigger path, oneshot generation, session switch, context reinjection, persistence, and UI behavior.\n\n## Scope\n\nCovers:\n\n- Interactive `/handoff` command dispatch\n- `AgentSession.handoff()` lifecycle and state transitions\n- `generateHandoff(...)` request shape\n- How old/new sessions persist handoff data differently\n- UI behavior for success, cancel, and failure\n\nDoes not cover:\n\n- Generic tree navigation/branch internals\n- Non-handoff session commands (`/new`, `/fork`, `/resume`)\n\n## Implementation files\n\n- [`../src/modes/controllers/input-controller.ts`](../packages/coding-agent/src/modes/controllers/input-controller.ts)\n- [`../src/modes/controllers/command-controller.ts`](../packages/coding-agent/src/modes/controllers/command-controller.ts)\n- [`../src/session/agent-session.ts`](../packages/coding-agent/src/session/agent-session.ts)\n- [`packages/agent/src/compaction/compaction.ts`](../packages/agent/src/compaction/compaction.ts)\n- [`../src/session/session-manager.ts`](../packages/coding-agent/src/session/session-manager.ts)\n- [`../src/extensibility/slash-commands.ts`](../packages/coding-agent/src/extensibility/slash-commands.ts)\n\n## Trigger path\n\n1. `/handoff` is declared in builtin slash command metadata (`slash-commands.ts`) with optional inline hint: `[focus instructions]`.\n2. In interactive input handling (`InputController`), submit text matching `/handoff` or `/handoff ...` is intercepted before normal prompt submission.\n3. The editor is cleared and `handleHandoffCommand(customInstructions?)` is called.\n4. `CommandController.handleHandoffCommand` performs a preflight guard using current entries:\n - Counts `type === \"message\"` entries.\n - If `< 2`, it warns: `Nothing to hand off (no messages yet)` and returns.\n\nThe same minimum-content guard exists again inside `AgentSession.handoff()` and throws if violated. This duplicates safety at both UI and session layers.\n\n## End-to-end lifecycle\n\n### 1) Start handoff generation\n\n`AgentSession.handoff(customInstructions?)`:\n\n- Reads current branch entries (`sessionManager.getBranch()`).\n- Validates minimum message count (`>= 2`).\n- Creates `#handoffAbortController` and links any caller-provided abort signal to it.\n- Resolves the current model API key through `ModelRegistry`.\n- Calls `generateHandoff(...)` with:\n - live agent messages (`agent.state.messages`),\n - the current model and API key,\n - the base system prompt (`#baseSystemPrompt`),\n - the live tool array (`agent.state.tools`),\n - optional focus instructions,\n - coding-agent message conversion (`convertToLlm`),\n - provider metadata and `initiatorOverride: \"agent\"`.\n\n`generateHandoff(...)` lives in `packages/agent/src/compaction/compaction.ts` next to summarization. It renders `packages/agent/src/compaction/prompts/handoff-document.md` via `renderHandoffPrompt(...)` with optional `additionalFocus`.\n\n### 2) Generate and capture output\n\n`generateHandoff(...)` converts the existing `AgentMessage[]` history to real LLM `Message[]` history, then appends one trailing agent-attributed `user` message containing the rendered handoff prompt.\n\nThe request uses `completeSimple(...)` directly:\n\n```ts\nawait completeSimple(\n model,\n {\n systemPrompt,\n messages: requestMessages,\n tools,\n },\n {\n apiKey,\n signal,\n reasoning: Effort.High,\n toolChoice: \"none\",\n initiatorOverride,\n metadata,\n },\n);\n```\n\nImportant generation properties:\n\n- The request preserves the live provider cache prefix by reusing the same system prompt, tool definitions, and real message history shape as the active agent.\n- The handoff instruction is a trailing `user` message, not a developer message, so the cached prefix remains aligned with the prior turn.\n- `toolChoice: \"none\"` prevents intentional tool dispatch.\n- The returned assistant content is filtered to text blocks and joined with `\\n`; stray tool-call blocks are ignored if a provider does not honor `toolChoice: \"none\"`.\n- `stopReason === \"error\"` throws a generation error.\n\nNo agent-loop events are used for capture. The handoff path no longer waits for `agent_end` and no longer scans the latest assistant message.\n\n### 3) Cancellation checks\n\nCancellation throws `Error(\"Handoff cancelled\")`; a completed generation with no text returns `undefined`.\n\n- caller signal aborts `#handoffAbortController`\n- `completeSimple(...)` receives the abort signal\n- aborted handoff signal or provider `AbortError` is normalized to `Error(\"Handoff cancelled\")`\n- empty generated text returns `undefined`\n\n`AgentSession.handoff()` always clears `#handoffAbortController` in `finally`.\n\n### 4) New session creation\n\nIf text was generated and not aborted:\n\n1. Flush current session writer (`sessionManager.flush()`).\n2. Cancel session-owned async jobs.\n3. Start a brand-new session with `parentSession` pointing at the previous session file when one exists.\n4. Reset in-memory agent state (`agent.reset()`).\n5. Rebind `agent.sessionId` to the new session id.\n6. Rekey/reset hindsight state for the new session.\n7. Clear queued context arrays (`#steeringMessages`, `#followUpMessages`, `#pendingNextTurnMessages`) and any scheduled hidden next-turn generation.\n8. Reset todo reminder counter.\n\n### 5) Handoff-context injection\n\nThe generated handoff document is wrapped by coding-agent session glue and appended to the new session as a `custom_message` entry:\n\n```text\n\n...handoff text...\n\n\nThe above is a handoff document from a previous session. Use this context to continue the work seamlessly.\n```\n\nInsertion call:\n\n```ts\nthis.sessionManager.appendCustomMessageEntry(\"handoff\", handoffContent, true, undefined, \"agent\");\n```\n\nSemantics:\n\n- `customType`: `\"handoff\"`\n- `display`: `true` (visible in TUI rebuild)\n- attribution: `\"agent\"`\n- Entry type: `custom_message` (participates in LLM context)\n\n### 6) Rebuild active agent context\n\nAfter injection:\n\n1. `buildDisplaySessionContext()` resolves message list for current leaf.\n2. `agent.replaceMessages(sessionContext.messages)` makes the injected handoff message active context.\n3. Todo phases are synchronized from the new branch.\n4. Method returns `{ document: handoffText, savedPath? }`.\n\nAt this point, the active LLM context in the new session contains the injected handoff message, not the old transcript.\n\n## Persistence model: old session vs new session\n\n### Old session\n\nHandoff generation is a oneshot request, not a visible agent turn. The generated handoff text is not appended to the old session as an assistant message.\n\nResult: the original session keeps its prior transcript unchanged except for data already persisted before handoff began.\n\n### New session\n\nAfter session reset, handoff is persisted as `custom_message` with `customType: \"handoff\"`.\n\n`buildSessionContext()` converts this entry into a runtime custom/user-context message via `createCustomMessage(...)`, so it is included in future prompts from the new session.\n\nAuto-triggered handoffs can additionally write a timestamped `handoff-*.md` artifact under the session artifacts directory when `compaction.handoffSaveToDisk` is enabled. Manual `/handoff` does not write that artifact.\n\n## Controller/UI behavior\n\n`CommandController.handleHandoffCommand` behavior:\n\n- Shows a status loader: `Generating handoff… (esc to cancel)`.\n- Calls `await session.handoff(customInstructions)`.\n- If result is `undefined`: `showError(\"Handoff cancelled\")`.\n- On success:\n - `rebuildChatFromMessages()` (loads new session context, including injected handoff)\n - invalidates status line and editor top border\n - reloads todos\n - appends success chat line: `New session started with handoff context`\n- On exception:\n - if message is `\"Handoff cancelled\"` or error name is `AbortError`: `showError(\"Handoff cancelled\")`\n - otherwise: `showError(\"Handoff failed: \")`\n- Stops the loader, restores the previous Escape handler, and requests render at end.\n\nManual `/handoff` no longer streams the generated document into chat. A cancellable loader remains visible while the oneshot request runs, and the chat is rebuilt after generation completes.\n\n## Cancellation semantics\n\n### Session-level cancellation primitive\n\n`AgentSession` exposes:\n\n- `abortHandoff()` → aborts `#handoffAbortController`\n- `isGeneratingHandoff` → true while controller exists\n\nWhen this abort path is used, the abort signal is passed to `completeSimple(...)`; `handoff()` normalizes the cancellation to `Error(\"Handoff cancelled\")`, and command controller maps it to cancellation UI.\n\n### Interactive `/handoff` path\n\nThe command controller installs a temporary Escape handler for `/handoff` while the loader is visible. Pressing Escape calls `session.abortHandoff()`, which aborts the `completeSimple(...)` request through `#handoffAbortController`.\n\n## Aborted vs failed handoff\n\nCurrent UI classification:\n\n- **Aborted/cancelled**\n - `abortHandoff()` path triggers `\"Handoff cancelled\"`, or\n - thrown `AbortError`\n - UI shows `Handoff cancelled`\n- **Failed**\n - any other thrown error from `handoff()` / `generateHandoff()` / provider request path\n - UI shows `Handoff failed: ...`\n\nAdditional nuance: if generation completes but no text is returned, `handoff()` returns `undefined` and controller currently reports **cancelled**, not **failed**.\n\n## Short-session and minimum-content guardrails\n\nTwo guards prevent low-signal handoffs:\n\n- UI layer (`handleHandoffCommand`): warns and returns early for `< 2` message entries\n- Session layer (`handoff()`): throws the same condition as an error\n\nThis avoids creating a new session with empty/near-empty handoff context.\n\n## State transition summary\n\nHigh-level state flow:\n\n1. Interactive slash command intercepted.\n2. Preflight message-count guard.\n3. `#handoffAbortController` created (`isGeneratingHandoff = true`).\n4. `generateHandoff(...)` issues one `completeSimple(...)` request with live system prompt, tools, message history, and trailing handoff prompt.\n5. Assistant response text blocks are joined; tool-call blocks are discarded.\n6. If missing text → return `undefined`; if aborted → cancellation error path.\n7. If present:\n - flush old session\n - cancel async jobs\n - create new empty session with previous session as parent\n - reset runtime queues/counters\n - append `custom_message(handoff)`\n - optionally save an auto-triggered handoff document under the session artifacts directory when `compaction.handoffSaveToDisk` is enabled\n8. Controller rebuilds chat UI and announces success.\n9. `#handoffAbortController` cleared (`isGeneratingHandoff = false`).\n\n## Known assumptions and limitations\n\n- No structural validation checks that generated markdown follows the requested section format.\n- Missing generated text is reported as cancellation in controller UX.\n- Manual handoff has no streaming visibility; a cancellable loader is shown until the UI updates after generation completes.\n- Auto-triggered handoffs can write a timestamped `handoff-*.md` artifact when `compaction.handoffSaveToDisk` is enabled; write failure is logged and does not fail the handoff.\n", "hooks.md": "# Hooks\n\nThis document describes the **current hook subsystem code** in `src/extensibility/hooks/*`.\n\n## Current status in runtime\n\nThe hook package (`src/extensibility/hooks/`) is still exported and usable as an API surface, but the default CLI runtime now initializes the **extension runner** path. In current startup flow:\n\n- `--hook` is treated as an alias for `--extension` (CLI paths are merged into `additionalExtensionPaths`)\n- tools are wrapped by `ExtensionToolWrapper`, not `HookToolWrapper`\n- context transforms and lifecycle emissions go through `ExtensionRunner`\n\nSo this file documents the hook subsystem implementation itself (types/loader/runner/wrapper), including legacy behavior and constraints.\n\n## Key files\n\n- `src/extensibility/hooks/types.ts` — hook context, event types, and result contracts\n- `src/extensibility/hooks/loader.ts` — module loading and hook discovery bridge\n- `src/extensibility/hooks/runner.ts` — event dispatch, command lookup, error signaling\n- `src/extensibility/hooks/tool-wrapper.ts` — pre/post tool interception wrapper\n- `src/extensibility/hooks/index.ts` — exports/re-exports\n\n## What a hook module is\n\nA hook module must default-export a factory:\n\n```ts\nimport type { HookAPI } from \"@oh-my-pi/pi-coding-agent/extensibility/hooks\";\n\nexport default function hook(pi: HookAPI): void {\n pi.on(\"tool_call\", async (event, ctx) => {\n if (\n event.toolName === \"bash\" &&\n String(event.input.command ?? \"\").includes(\"rm -rf\")\n ) {\n return { block: true, reason: \"blocked by policy\" };\n }\n });\n}\n```\n\nThe factory can:\n\n- register event handlers with `pi.on(...)`\n- send persistent custom messages with `pi.sendMessage(...)`\n- persist non-LLM state with `pi.appendEntry(...)`\n- register slash commands via `pi.registerCommand(...)`\n- register custom message renderers via `pi.registerMessageRenderer(...)`\n- run shell commands via `pi.exec(...)`\n\n## Discovery and loading\n\n`discoverAndLoadHooks(configuredPaths, cwd)` does:\n\n1. Load discovered hooks from capability registry (`loadCapability(\"hooks\")`)\n2. Append explicitly configured paths (deduped by absolute path)\n3. Call `loadHooks(allPaths, cwd)`\n\n`loadHooks` then imports each path and expects a `default` function.\n\n### Path resolution\n\n`loader.ts` resolves hook paths as:\n\n- absolute path: used as-is\n- `~` path: expanded\n- relative path: resolved against `cwd`\n\n### Important legacy mismatch\n\nDiscovery providers for `hookCapability` still model pre/post shell-style hook files (for example `.claude/hooks/pre/*`, `.omp/.../hooks/pre/*`).\n\nThe hook loader here uses dynamic module import and requires a default JS/TS hook factory. If a discovered hook path is not importable as a module, load fails and is reported in `LoadHooksResult.errors`.\n\n## Event surfaces\n\nHook events are strongly typed in `types.ts`.\n\n### Session events\n\n- `session_start`\n- `session_before_switch` → can return `{ cancel?: boolean }`\n- `session_switch`\n- `session_before_branch` → can return `{ cancel?: boolean; skipConversationRestore?: boolean }`\n- `session_branch`\n- `session_before_compact` → can return `{ cancel?: boolean; compaction?: CompactionResult }`\n- `session.compacting` → can return `{ context?: string[]; prompt?: string; preserveData?: Record }`\n- `session_compact`\n- `session_before_tree` → can return `{ cancel?: boolean; summary?: { summary: string; details?: unknown } }`\n- `session_tree`\n- `session_shutdown`\n\n### Agent/context events\n\n- `context` → can return `{ messages?: Message[] }`\n- `before_agent_start` → can return `{ message?: { customType; content; display; details } }`\n- `agent_start`\n- `agent_end`\n- `turn_start`\n- `turn_end`\n- `auto_compaction_start`\n- `auto_compaction_end`\n- `auto_retry_start`\n- `auto_retry_end`\n- `ttsr_triggered`\n- `todo_reminder`\n\n### Tool events (pre/post model)\n\n- `tool_call` (pre-execution) → can return `{ block?: boolean; reason?: string }`\n- `tool_result` (post-execution) → can return `{ content?; details?; isError? }`\n\nThis is the hook subsystem’s core pre/post interception model.\n\n```text\nHook tool interception flow\n\ntool_call handlers\n │\n ├─ any { block: true }? ── yes ──> throw (tool blocked)\n │\n └─ no\n │\n ▼\n execute underlying tool\n │\n ├─ success ──> tool_result handlers can override { content, details }\n │\n └─ error ──> emit tool_result(isError=true) then rethrow original error\n```\n\n## Execution model and mutation semantics\n\n### 1) Pre-execution: `tool_call`\n\n`HookToolWrapper.execute()` emits `tool_call` before tool execution.\n\n- if any handler returns `{ block: true }`, execution stops\n- if handler throws, wrapper fails closed and blocks execution\n- returned `reason` becomes the thrown error text\n\n### 2) Tool execution\n\nUnderlying tool executes normally if not blocked.\n\n### 3) Post-execution: `tool_result`\n\nAfter success, wrapper emits `tool_result` with:\n\n- `toolName`, `toolCallId`, `input`\n- `content`\n- `details`\n- `isError: false`\n\nIf handler returns overrides:\n\n- `content` can replace result content\n- `details` can replace result details\n\nOn tool failure, wrapper emits `tool_result` with `isError: true` and error text content, then rethrows original error.\n\n### What hooks can mutate\n\n- LLM context for a single call via `context` (`messages` replacement chain)\n- tool output content/details on successful tool calls (`tool_result` path)\n- pre-agent injected message via `before_agent_start`\n- cancellation/custom compaction/tree behavior via `session_before_*` and `session.compacting`\n\n### What hooks cannot mutate in this implementation\n\n- raw tool input parameters in-place (only block/allow on `tool_call`)\n- execution continuation after thrown tool errors (error path rethrows)\n- final success/error status in wrapper behavior (returned `isError` is typed but not applied by `HookToolWrapper`)\n\n## Ordering and conflict behavior\n\n### Discovery-level ordering\n\nCapability providers are priority-sorted (higher first). Dedupe is by capability key, first wins.\n\nFor `hooks`, capability key is `${type}:${tool}:${name}`. Shadowed duplicates from lower-priority providers are marked and excluded from effective discovered list.\n\n### Load order\n\n`discoverAndLoadHooks` builds a flat `allPaths` list, deduped by resolved absolute path, then `loadHooks` iterates in that order.\nFile order within each discovered directory depends on `readdir` output; the hook loader does not perform an additional sort.\n\n### Runtime handler order\n\nInside `HookRunner`, order is deterministic by registration sequence:\n\n1. hooks array order\n2. handler registration order per hook/event\n\nConflict behavior by event type:\n\n- `tool_call`: last returned result wins unless a handler blocks; first block short-circuits\n- `tool_result`: last returned override wins (no short-circuit)\n- `context`: chained; each handler receives prior handler’s message output\n- `before_agent_start`: first returned message is kept; later messages ignored\n- `session_before_*`: latest returned result is tracked; `cancel: true` short-circuits immediately\n- `session.compacting`: latest returned result wins\n\nCommand/renderer conflicts:\n\n- `getCommand(name)` returns first match across hooks (first loaded wins)\n- `getMessageRenderer(customType)` returns first match\n- `getRegisteredCommands()` returns all commands (no dedupe)\n\n## UI interactions (`HookContext.ui`)\n\n`HookUIContext` includes:\n\n- `select`, `confirm`, `input`, `editor`\n- `notify`\n- `setStatus`\n- `custom`\n- `setEditorText`, `getEditorText`\n- `theme` getter\n\n`ctx.hasUI` indicates whether interactive UI is available.\n\nWhen running with no UI, the default no-op context behavior is:\n\n- `select/input/editor` return `undefined`\n- `confirm` returns `false`\n- `notify`, `setStatus`, `setEditorText` are no-ops\n- `getEditorText` returns `\"\"`\n\n### Status line behavior\n\nHook status text set via `ctx.ui.setStatus(key, text)` is:\n\n- stored per key\n- sorted by key name\n- sanitized (`\\r`, `\\n`, `\\t` → spaces; repeated spaces collapsed)\n- joined and width-truncated for display\n\n## Error propagation and fallback\n\n### Load-time\n\n- invalid module or missing default export → captured in `LoadHooksResult.errors`\n- loading continues for other hooks\n\n### Event-time\n\n`HookRunner.emit(...)` catches handler errors for most events and emits `HookError` to listeners (`hookPath`, `event`, `error`), then continues.\n\n`emitToolCall(...)` is stricter: handler errors are not swallowed there; they propagate to caller. In `HookToolWrapper`, this blocks the tool call (fail-safe).\n\n## Realistic API examples\n\n### Block unsafe bash commands\n\n```ts\nimport type { HookAPI } from \"@oh-my-pi/pi-coding-agent/extensibility/hooks\";\n\nexport default function (pi: HookAPI): void {\n pi.on(\"tool_call\", async (event, ctx) => {\n if (event.toolName !== \"bash\") return;\n const cmd = String(event.input.command ?? \"\");\n if (!cmd.includes(\"rm -rf\")) return;\n\n if (!ctx.hasUI) return { block: true, reason: \"rm -rf blocked (no UI)\" };\n const ok = await ctx.ui.confirm(\"Dangerous command\", `Allow: ${cmd}`);\n if (!ok) return { block: true, reason: \"user denied command\" };\n });\n}\n```\n\n### Redact tool output on post-execution\n\n```ts\nimport type { HookAPI } from \"@oh-my-pi/pi-coding-agent/extensibility/hooks\";\n\nexport default function (pi: HookAPI): void {\n pi.on(\"tool_result\", async (event) => {\n if (event.toolName !== \"read\" || event.isError) return;\n\n const redacted = event.content.map((chunk) => {\n if (chunk.type !== \"text\") return chunk;\n return {\n ...chunk,\n text: chunk.text.replaceAll(/API_KEY=\\S+/g, \"API_KEY=[REDACTED]\"),\n };\n });\n\n return { content: redacted };\n });\n}\n```\n\n### Modify model context per LLM call\n\n```ts\nimport type { HookAPI } from \"@oh-my-pi/pi-coding-agent/extensibility/hooks\";\n\nexport default function (pi: HookAPI): void {\n pi.on(\"context\", async (event) => {\n const filtered = event.messages.filter(\n (msg) => !(msg.role === \"custom\" && msg.customType === \"debug-only\"),\n );\n return { messages: filtered };\n });\n}\n```\n\n### Register slash command with command-safe context methods\n\n```ts\nimport type { HookAPI } from \"@oh-my-pi/pi-coding-agent/extensibility/hooks\";\n\nexport default function (pi: HookAPI): void {\n pi.registerCommand(\"handoff\", {\n description: \"Create a new session with setup message\",\n handler: async (_args, ctx) => {\n await ctx.waitForIdle();\n await ctx.newSession({\n parentSession: ctx.sessionManager.getSessionFile(),\n setup: async (sm) => {\n sm.appendMessage({\n role: \"user\",\n content: [\n { type: \"text\", text: \"Continue from prior session summary.\" },\n ],\n timestamp: Date.now(),\n });\n },\n });\n },\n });\n}\n```\n\n## Export surface\n\n`src/extensibility/hooks/index.ts` and the package subpath `@oh-my-pi/pi-coding-agent/extensibility/hooks` export:\n\n- loading APIs (`discoverAndLoadHooks`, `loadHooks`)\n- runner and wrapper (`HookRunner`, `HookToolWrapper`)\n- all hook types\n- `execCommand` re-export\n\nThe package root (`@oh-my-pi/pi-coding-agent`) does not re-export `HookAPI`; import legacy hook types from the hooks subpath.\n", "install-id.md": "# Install ID\n\nA persistent per-install UUID that identifies a single oh-my-pi installation across sessions. Used as a stable correlation key for server-side dedup of telemetry-style pushes (currently the auto-QA grievance flush from `report_tool_issue`).\n\n## API\n\nExported from `@oh-my-pi/pi-utils` (`packages/utils/src/dirs.ts`):\n\n| Symbol | Purpose |\n| --- | --- |\n| `getInstallId(): string` | Returns the install ID, generating and persisting one on first call. Result is cached in-process for the lifetime of the runtime. |\n| `__resetInstallIdCacheForTests(): void` | Clears the in-process cache. Test-only — MUST NOT be called from production code. |\n\nThe returned value is a canonical lowercase RFC 4122 UUID matching `^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`.\n\n## Storage\n\n- Path: `/install-id` — i.e. `~/.omp/install-id` by default, respecting `PI_CONFIG_DIR` via `getConfigRootDir()`.\n- Format: a single UUID line (trailing `\\n`).\n- Permissions: file is created with mode `0o600`.\n- Lifecycle: independent of `~/.omp/agent/`. Wiping agent state (sessions, settings, DB) does NOT regenerate the install ID; only deleting the `install-id` file itself does.\n\n## Generation and lifecycle\n\n1. First call to `getInstallId()` reads the file. If contents parse as a valid UUID, that value is cached and returned.\n2. Otherwise the helper calls `crypto.randomUUID()` (Node's CSPRNG-backed UUID v4) to mint a new ID.\n3. The new value is written via `open(O_WRONLY | O_CREAT | O_EXCL, 0o600)`. The exclusive-create guard means two processes hitting first-call simultaneously cannot both succeed — the loser sees `EEXIST`, re-reads the winner's file, and adopts that ID.\n4. If the existing file contained non-empty garbage (failed UUID regex), it is `unlink`ed before the exclusive create so `O_EXCL` does not trip on stale data.\n5. Any other write failure (read-only FS, permission error) is swallowed: the freshly generated UUID is still cached in-memory so the rest of the process sees a stable value, and subsequent process launches will retry persistence.\n6. Subsequent in-process calls return the cached value without touching disk. Mutating the file on disk after the first call has no effect until the process restarts (or tests call `__resetInstallIdCacheForTests`).\n\n## Consumers\n\n- `packages/coding-agent/src/tools/report-tool-issue.ts` — included as `installId` in the auto-QA grievance push body so the backend can deduplicate repeated reports from the same install. See `dev.autoqaPush.*` settings and `PI_AUTO_QA_PUSH_*` env vars.\n\nNew consumers MUST treat the value as opaque and MUST NOT derive PII from it; the helper does not mix in hostname, username, or any other host-identifying entropy.\n\n## See also\n\n- [environment-variables.md](environment-variables.md) — `PI_CONFIG_DIR` controls where `install-id` lives.\n- [config-usage.md](config-usage.md) — broader config-root layout.\n", "marketplace.md": "# Marketplace plugin system\n\nThe marketplace system lets you discover, install, and manage plugins from Git-hosted catalogs. It is compatible with the Claude Code plugin registry format.\n\n## Quick start\n\n```\n/marketplace add anthropics/claude-plugins-official\n/marketplace install wordpress.com@claude-plugins-official\n```\n\nOr just type `/marketplace` with no arguments to open the interactive plugin browser.\n\n## Concepts\n\nA **marketplace** is a Git repository (or local directory) containing a catalog file at `.claude-plugin/marketplace.json`. The catalog lists available plugins with their sources, descriptions, and metadata.\n\nA **plugin** is a directory containing skills, commands, hooks, MCP servers, or LSP servers. Plugins are identified by `name@marketplace` (e.g. `code-review@claude-plugins-official`).\n\n**Scopes**: plugins can be installed at two scopes:\n\n- **user** (default) -- available in all projects, stored in `~/.omp/plugins/installed_plugins.json`\n- **project** -- available only in the current project, stored in `.omp/plugins/installed_plugins.json`\n\nProject-scoped installs shadow user-scoped installs of the same plugin.\n\n## Commands\n\n### Interactive mode\n\n| Command | Effect |\n| -------------- | ----------------------------------------- |\n| `/marketplace` | Open interactive plugin browser (install) |\n\n### Marketplace management\n\n| Command | Effect |\n| ---------------------------- | -------------------------------------------- |\n| `/marketplace add ` | Add a marketplace source |\n| `/marketplace remove ` | Remove a marketplace |\n| `/marketplace update [name]` | Re-fetch catalog(s); omit name to update all |\n| `/marketplace list` | List configured marketplaces |\n\n### Plugin operations\n\n| Command | Effect |\n| ------------------------------------------------------------------------- | ---------------------------------- |\n| `/marketplace discover [marketplace]` | Browse available plugins |\n| `/marketplace install [--force] [--scope user\\|project] name@marketplace` | Install a plugin |\n| `/marketplace uninstall [--scope user\\|project] name@marketplace` | Uninstall a plugin |\n| `/marketplace installed` | List installed marketplace plugins |\n| `/marketplace upgrade [--scope user\\|project] [name@marketplace]` | Upgrade one or all plugins |\n\n### CLI equivalents\n\nThe same operations are available from the command line:\n\n```\nomp plugin marketplace add \nomp plugin marketplace remove \nomp plugin marketplace update [name]\nomp plugin marketplace list\nomp plugin discover [marketplace]\nomp plugin install [--force] [--scope user|project] name@marketplace\nomp plugin uninstall [--scope user|project] name@marketplace\nomp plugin upgrade [--scope user|project] [name@marketplace]\n```\n\n## Marketplace sources\n\nWhen you run `/marketplace add `, the system classifies the source:\n\n| Source format | Type | Example |\n| ------------------------------- | ------------------ | -------------------------------------- |\n| `owner/repo` | GitHub shorthand | `anthropics/claude-plugins-official` |\n| `https://...*.json` | Direct catalog URL | `https://example.com/marketplace.json` |\n| `https://...*.git` or `git@...` | Git repository | `https://github.com/org/repo.git` |\n| `./path` or `~/path` or `/path` | Local directory | `./my-marketplace` |\n\nThe system clones the repository (or reads the local directory), locates `.claude-plugin/marketplace.json`, validates it, and caches the catalog locally.\n\n## Catalog format (marketplace.json)\n\nA marketplace catalog lives at `.claude-plugin/marketplace.json` in the repository root:\n\n```json\n{\n \"$schema\": \"https://anthropic.com/claude-code/marketplace.schema.json\",\n \"name\": \"my-marketplace\",\n \"owner\": {\n \"name\": \"Your Name\",\n \"email\": \"you@example.com\"\n },\n \"description\": \"A collection of plugins\",\n \"plugins\": [\n {\n \"name\": \"my-plugin\",\n \"description\": \"What this plugin does\",\n \"source\": \"./plugins/my-plugin\",\n \"category\": \"development\",\n \"homepage\": \"https://github.com/you/my-plugin\"\n }\n ]\n}\n```\n\n### Required fields\n\n| Field | Description |\n| ------------ | ---------------------------------------------------------------------------------------------------------------- |\n| `name` | Marketplace name. Lowercase alphanumeric, hyphens, and dots. Must start and end with alphanumeric. Max 64 chars. |\n| `owner.name` | Marketplace owner name |\n| `plugins` | Array of plugin entries |\n\n### Plugin entry fields\n\n| Field | Required | Description |\n| ------------- | -------- | ---------------------------------------------------------------- |\n| `name` | yes | Plugin name (same rules as marketplace name) |\n| `source` | yes | Where to find the plugin (see below) |\n| `description` | no | Short description |\n| `version` | no | Version string |\n| `author` | no | `{ name, email? }` |\n| `homepage` | no | URL |\n| `category` | no | Category string (e.g. `development`, `productivity`, `security`) |\n| `tags` | no | Array of string tags |\n| `strict` | no | Boolean |\n| `commands` | no | Slash commands provided |\n| `agents` | no | Agents provided |\n| `hooks` | no | Hook definitions |\n| `mcpServers` | no | MCP server definitions |\n| `lspServers` | no | LSP server definitions |\n\n### Plugin source formats\n\nThe `source` field supports several formats:\n\n**Relative path** (within the marketplace repo):\n\n```json\n\"source\": \"./plugins/my-plugin\"\n```\n\n**Git repository URL**:\n\n```json\n\"source\": {\n \"source\": \"url\",\n \"url\": \"https://github.com/org/repo.git\",\n \"sha\": \"abc123...\"\n}\n```\n\n**GitHub shorthand**:\n\n```json\n\"source\": {\n \"source\": \"github\",\n \"repo\": \"org/repo\",\n \"ref\": \"main\",\n \"sha\": \"abc123...\"\n}\n```\n\n**Git subdirectory** (monorepo):\n\n```json\n\"source\": {\n \"source\": \"git-subdir\",\n \"url\": \"https://github.com/org/monorepo.git\",\n \"path\": \"plugins/my-plugin\",\n \"ref\": \"main\",\n \"sha\": \"abc123...\"\n}\n```\n\n**npm package**:\n\n```json\n\"source\": {\n \"source\": \"npm\",\n \"package\": \"@scope/my-plugin\",\n \"version\": \"1.0.0\"\n}\n```\n\n## On-disk layout\n\n```\n~/.omp/\n marketplaces.json # Registry of added marketplaces\n plugins/\n installed_plugins.json # User-scoped installed plugins\n cache/\n marketplaces/ # Cached marketplace catalogs\n plugins/ # Cached plugin directories\n\n/.omp/\n plugins/\n installed_plugins.json # Project-scoped installed plugins\n```\n\n## Naming rules\n\nMarketplace and plugin names must:\n\n- Start and end with a lowercase letter or digit\n- Contain only lowercase letters, digits, hyphens, and dots\n- Be at most 64 characters\n\nPlugin IDs (`name@marketplace`) must be at most 128 characters total.\n\nValid examples: `my-plugin`, `code-review`, `wordpress.com`, `ai-firstify`\nInvalid examples: `-bad`, `bad-`, `.bad`, `Bad`, `under_score`\n", "mcp-config.md": "# MCP configuration in OMP\n\nThis guide explains how to add, edit, and validate MCP servers for the OMP coding agent.\n\nSource of truth in code:\n\n- Runtime config types: `packages/coding-agent/src/mcp/types.ts`\n- Config writer: `packages/coding-agent/src/mcp/config-writer.ts`\n- Loader + validation: `packages/coding-agent/src/mcp/config.ts`\n- Standalone `mcp.json` discovery: `packages/coding-agent/src/discovery/mcp-json.ts`\n- Schema: `packages/coding-agent/src/config/mcp-schema.json`\n\n## Preferred config locations\n\nOMP can discover MCP servers from multiple tools (`.claude/`, `.cursor/`, `.vscode/`, `opencode.json`, and more), but for OMP-native configuration you should usually use one of these files:\n\n- Project: `.omp/mcp.json`\n- User: `~/.omp/agent/mcp.json`\n\nOMP also accepts fallback standalone files in the project root:\n\n- `mcp.json`\n- `.mcp.json`\n\nUse `.omp/mcp.json` or `~/.omp/agent/mcp.json` when you want OMP to own the configuration. Use root `mcp.json` / `.mcp.json` only when you want a portable fallback file that other MCP clients may also read.\n\n## Add a schema reference\n\nAdd this line at the top of the file for editor autocomplete and validation:\n\n```json\n{\n \"$schema\": \"https://raw.githubusercontent.com/can1357/oh-my-pi/main/packages/coding-agent/src/config/mcp-schema.json\",\n \"mcpServers\": {}\n}\n```\n\nOMP now writes this automatically when `/mcp add`, `/mcp enable`, `/mcp disable`, `/mcp reauth`, or other config-writing flows create or update an OMP-managed MCP file.\n\n## File shape\n\nOMP supports this top-level structure:\n\n```json\n{\n \"$schema\": \"https://raw.githubusercontent.com/can1357/oh-my-pi/main/packages/coding-agent/src/config/mcp-schema.json\",\n \"mcpServers\": {\n \"server-name\": {\n \"type\": \"stdio\",\n \"command\": \"npx\",\n \"args\": [\"-y\", \"some-mcp-server\"]\n }\n },\n \"disabledServers\": [\"server-name\"]\n}\n```\n\nTop-level keys:\n\n- `$schema` — optional JSON Schema URL for tooling\n- `mcpServers` — map of server name to server config\n- `disabledServers` — user-level denylist used to turn off discovered servers by name; runtime loading reads this list from `~/.omp/agent/mcp.json`\n\nServer names must match `^[a-zA-Z0-9_.-]{1,100}$`.\n\n## Supported server fields\n\nShared fields for every transport:\n\n- `enabled?: boolean` — skip this server when `false`\n- `timeout?: number` — connection timeout in milliseconds\n- `auth?: { ... }` — auth metadata used by OMP for OAuth/API-key flows\n- `oauth?: { ... }` — explicit OAuth client settings used during auth/reauth\n\n### `stdio` transport\n\n`stdio` is the default when `type` is omitted.\n\nRequired:\n\n- `command: string`\n\nOptional:\n\n- `type?: \"stdio\"`\n- `args?: string[]`\n- `env?: Record`\n- `cwd?: string`\n\nExample:\n\n```json\n{\n \"$schema\": \"https://raw.githubusercontent.com/can1357/oh-my-pi/main/packages/coding-agent/src/config/mcp-schema.json\",\n \"mcpServers\": {\n \"filesystem\": {\n \"command\": \"npx\",\n \"args\": [\n \"-y\",\n \"@modelcontextprotocol/server-filesystem\",\n \"/Users/alice/projects\",\n \"/Users/alice/Documents\"\n ]\n }\n }\n}\n```\n\nThis follows the official Filesystem MCP server package (`@modelcontextprotocol/server-filesystem`).\n\n### `http` transport\n\nRequired:\n\n- `type: \"http\"`\n- `url: string`\n\nOptional:\n\n- `headers?: Record`\n\nExample:\n\n```json\n{\n \"$schema\": \"https://raw.githubusercontent.com/can1357/oh-my-pi/main/packages/coding-agent/src/config/mcp-schema.json\",\n \"mcpServers\": {\n \"github\": {\n \"type\": \"http\",\n \"url\": \"https://api.githubcopilot.com/mcp/\"\n }\n }\n}\n```\n\nThis matches GitHub's hosted GitHub MCP server endpoint.\n\n### `sse` transport\n\nRequired:\n\n- `type: \"sse\"`\n- `url: string`\n\nOptional:\n\n- `headers?: Record`\n\nExample:\n\n```json\n{\n \"$schema\": \"https://raw.githubusercontent.com/can1357/oh-my-pi/main/packages/coding-agent/src/config/mcp-schema.json\",\n \"mcpServers\": {\n \"legacy-remote\": {\n \"type\": \"sse\",\n \"url\": \"https://example.com/mcp/sse\"\n }\n }\n}\n```\n\n`sse` is still supported for compatibility, but the MCP spec now prefers Streamable HTTP (`type: \"http\"`) for new servers.\n\n## Auth fields\n\nOMP understands two auth-related objects.\n\n### `auth`\n\n```json\n{\n \"type\": \"oauth\" | \"apikey\",\n \"credentialId\": \"optional-stored-credential-id\",\n \"tokenUrl\": \"optional-token-endpoint\",\n \"clientId\": \"optional-client-id\",\n \"clientSecret\": \"optional-client-secret\"\n}\n```\n\nUse this when OMP should remember how to rehydrate credentials for a server.\n\n### `oauth`\n\n```json\n{\n \"clientId\": \"...\",\n \"clientSecret\": \"...\",\n \"redirectUri\": \"...\",\n \"callbackPort\": 3334,\n \"callbackPath\": \"/oauth/callback\"\n}\n```\n\nUse this when the MCP server requires explicit OAuth client settings.\n\nSlack is the clearest current example. Slack's MCP server is hosted at `https://mcp.slack.com/mcp`, uses Streamable HTTP, and requires confidential OAuth with your Slack app's client credentials.\n\nExample:\n\n```json\n{\n \"$schema\": \"https://raw.githubusercontent.com/can1357/oh-my-pi/main/packages/coding-agent/src/config/mcp-schema.json\",\n \"mcpServers\": {\n \"slack\": {\n \"type\": \"http\",\n \"url\": \"https://mcp.slack.com/mcp\",\n \"oauth\": {\n \"clientId\": \"YOUR_SLACK_CLIENT_ID\",\n \"clientSecret\": \"YOUR_SLACK_CLIENT_SECRET\"\n },\n \"auth\": {\n \"type\": \"oauth\",\n \"tokenUrl\": \"https://slack.com/api/oauth.v2.user.access\",\n \"clientId\": \"YOUR_SLACK_CLIENT_ID\",\n \"clientSecret\": \"YOUR_SLACK_CLIENT_SECRET\"\n }\n }\n }\n}\n```\n\nRelevant Slack endpoints from Slack's docs:\n\n- MCP endpoint: `https://mcp.slack.com/mcp`\n- Authorization endpoint: `https://slack.com/oauth/v2_user/authorize`\n- Token endpoint: `https://slack.com/api/oauth.v2.user.access`\n\n## Common copy-paste examples\n\n### Filesystem server via stdio\n\n```json\n{\n \"$schema\": \"https://raw.githubusercontent.com/can1357/oh-my-pi/main/packages/coding-agent/src/config/mcp-schema.json\",\n \"mcpServers\": {\n \"filesystem\": {\n \"command\": \"npx\",\n \"args\": [\n \"-y\",\n \"@modelcontextprotocol/server-filesystem\",\n \"/absolute/path/one\",\n \"/absolute/path/two\"\n ]\n }\n }\n}\n```\n\n### GitHub hosted server via HTTP\n\n```json\n{\n \"$schema\": \"https://raw.githubusercontent.com/can1357/oh-my-pi/main/packages/coding-agent/src/config/mcp-schema.json\",\n \"mcpServers\": {\n \"github\": {\n \"type\": \"http\",\n \"url\": \"https://api.githubcopilot.com/mcp/\"\n }\n }\n}\n```\n\n### GitHub local server via Docker\n\n```json\n{\n \"$schema\": \"https://raw.githubusercontent.com/can1357/oh-my-pi/main/packages/coding-agent/src/config/mcp-schema.json\",\n \"mcpServers\": {\n \"github\": {\n \"command\": \"docker\",\n \"args\": [\n \"run\",\n \"-i\",\n \"--rm\",\n \"-e\",\n \"GITHUB_PERSONAL_ACCESS_TOKEN\",\n \"ghcr.io/github/github-mcp-server\"\n ],\n \"env\": {\n \"GITHUB_PERSONAL_ACCESS_TOKEN\": \"GITHUB_PERSONAL_ACCESS_TOKEN\"\n }\n }\n }\n}\n```\n\nThis matches GitHub's official local Docker image `ghcr.io/github/github-mcp-server`.\n\n### Slack hosted server via OAuth\n\n```json\n{\n \"$schema\": \"https://raw.githubusercontent.com/can1357/oh-my-pi/main/packages/coding-agent/src/config/mcp-schema.json\",\n \"mcpServers\": {\n \"slack\": {\n \"type\": \"http\",\n \"url\": \"https://mcp.slack.com/mcp\",\n \"oauth\": {\n \"clientId\": \"YOUR_SLACK_CLIENT_ID\",\n \"clientSecret\": \"YOUR_SLACK_CLIENT_SECRET\"\n },\n \"auth\": {\n \"type\": \"oauth\",\n \"tokenUrl\": \"https://slack.com/api/oauth.v2.user.access\",\n \"clientId\": \"YOUR_SLACK_CLIENT_ID\",\n \"clientSecret\": \"YOUR_SLACK_CLIENT_SECRET\"\n }\n }\n }\n}\n```\n\n## Secrets and variable resolution\n\nThis is the part that usually trips people up.\n\n### In `.omp/mcp.json` and `~/.omp/agent/mcp.json`\n\nBefore OMP launches a stdio server or makes an HTTP/SSE request, it resolves stdio `env` values and HTTP/SSE `headers` values like this:\n\n1. If a value starts with `!`, OMP runs the rest as a shell command with a 10s timeout and uses trimmed stdout.\n2. If the command fails, times out, or prints only whitespace, that `env`/`headers` entry is omitted.\n3. Otherwise OMP checks whether the value names an environment variable.\n4. If that environment variable is set to a non-empty value, OMP uses the environment value; otherwise it uses the string literally.\n\nExamples:\n\n```json\n{\n \"env\": {\n \"GITHUB_PERSONAL_ACCESS_TOKEN\": \"GITHUB_PERSONAL_ACCESS_TOKEN\"\n },\n \"headers\": {\n \"X-MCP-Insiders\": \"true\"\n }\n}\n```\n\nThat means this is valid and convenient for local secrets:\n\n- `\"GITHUB_PERSONAL_ACCESS_TOKEN\": \"GITHUB_PERSONAL_ACCESS_TOKEN\"` → copy from the current shell environment\n- `\"Authorization\": \"Bearer hardcoded-token\"` → use the literal value\n- `\"Authorization\": \"!printf 'Bearer %s' \\\"$GITHUB_TOKEN\\\"\"` → build the header from a command\n\n### In root `mcp.json` and `.mcp.json`\n\nThe standalone fallback loader also expands `${VAR}` and `${VAR:-default}` inside strings during discovery for `command`, `args`, `env`, `cwd`, `url`, `headers`, `auth`, and `oauth`.\n\nExample:\n\n```json\n{\n \"mcpServers\": {\n \"github\": {\n \"type\": \"http\",\n \"url\": \"https://api.githubcopilot.com/mcp/\",\n \"headers\": {\n \"Authorization\": \"Bearer ${GITHUB_TOKEN}\"\n }\n }\n }\n}\n```\n\nIf you want the least surprising OMP behavior, prefer `.omp/mcp.json` or `~/.omp/agent/mcp.json` and use explicit env/header values.\n\n## `disabledServers`\n\n`disabledServers` is read from the user config file (`~/.omp/agent/mcp.json`) when a server is discovered from any source and you want OMP to ignore it without editing that other tool's config.\n\nExample:\n\n```json\n{\n \"$schema\": \"https://raw.githubusercontent.com/can1357/oh-my-pi/main/packages/coding-agent/src/config/mcp-schema.json\",\n \"disabledServers\": [\"github\", \"slack\"]\n}\n```\n\n## `/mcp add` vs editing JSON directly\n\nUse `/mcp add` when you want guided setup.\n\nUse direct JSON editing when:\n\n- you need a transport or auth option the wizard does not prompt for yet\n- you want to paste a server definition from another MCP client\n- you want schema-backed validation in your editor\n\nAfter editing, use:\n\n- `/mcp reload` to rediscover and reconnect servers in the current session\n- `/mcp list` to see which config file a server came from\n- `/mcp test ` to test a single server\n- `/mcp reconnect ` to reconnect one server without rediscovering all configs\n- `/mcp resources`, `/mcp prompts`, and `/mcp notifications` to inspect non-tool MCP capabilities\n\n## Validation rules OMP enforces\n\nFrom `validateServerConfig()` in `packages/coding-agent/src/mcp/config.ts`:\n\n- `stdio` requires `command`\n- `http` and `sse` require `url`\n- a server cannot set both `command` and `url`\n- unknown `type` values are rejected\n\nPractical implications:\n\n- Omitting `type` means `stdio`\n- If you paste a remote server config and forget `\"type\": \"http\"`, OMP will treat it as `stdio` and complain that `command` is missing\n- `sse` remains valid for compatibility, but new hosted servers should usually be configured as `http`\n\n## Discovery and precedence\n\nOMP does not merge duplicate server definitions across files. Discovery providers are prioritized, and the higher-priority definition wins. Separately, `disabledServers` from `~/.omp/agent/mcp.json` can suppress a discovered server by name.\n\nIn practice:\n\n- prefer `.omp/mcp.json` or `~/.omp/agent/mcp.json` when you want an OMP-specific override\n- keep server names unique across tools when possible\n- use `disabledServers` in the user config when a third-party config keeps reintroducing a server you do not want\n\n## Troubleshooting\n\n### `Server \"name\": stdio server requires \"command\" field`\n\nYou probably omitted `type: \"http\"` on a remote server.\n\n### `Server \"name\": both \"command\" and \"url\" are set`\n\nPick one transport. OMP treats `command` as stdio and `url` as http/sse.\n\n### `/mcp add` worked but the server still does not connect\n\nThe JSON is valid, but the server may still be unreachable. Use `/mcp test ` and check whether:\n\n- the binary or Docker image exists\n- required environment variables are set\n- the remote URL is reachable\n- the OAuth or API token is valid\n\n### The server exists in another tool's config but not in OMP\n\nRun `/mcp list`. OMP discovers many third-party MCP files, but project-level loading can also be disabled via the `mcp.enableProjectConfig` setting, and a user-level `disabledServers` entry can suppress a server by name.\n\n## References\n\n- MCP transport spec: https://modelcontextprotocol.io/specification/2025-03-26/basic/transports\n- Filesystem server package: https://www.npmjs.com/package/@modelcontextprotocol/server-filesystem\n- GitHub MCP server: https://github.com/github/github-mcp-server\n- Slack MCP server docs: https://docs.slack.dev/ai/slack-mcp-server/\n", "mcp-protocol-transports.md": "# MCP Protocol and Transport Internals\n\nThis document describes how coding-agent implements MCP JSON-RPC messaging and how protocol concerns are split from transport concerns.\n\n## Scope\n\nCovers:\n\n- JSON-RPC request/response and notification flow\n- Server-to-client request handling (`ping`, `roots/list`)\n- Request correlation and lifecycle for stdio and HTTP/SSE transports\n- Timeout, cancellation, and auth-refresh behavior\n- Error propagation and malformed payload handling\n- Transport selection boundaries (`stdio` vs `http`/`sse`)\n- Which reconnect/retry responsibilities are transport-level vs manager/tool-bridge-level\n\nDoes not cover extension authoring UX or command UI.\n\n## Implementation files\n\n- [`src/mcp/types.ts`](../packages/coding-agent/src/mcp/types.ts)\n- [`src/mcp/transports/stdio.ts`](../packages/coding-agent/src/mcp/transports/stdio.ts)\n- [`src/mcp/transports/http.ts`](../packages/coding-agent/src/mcp/transports/http.ts)\n- [`src/mcp/transports/index.ts`](../packages/coding-agent/src/mcp/transports/index.ts)\n- [`src/mcp/json-rpc.ts`](../packages/coding-agent/src/mcp/json-rpc.ts)\n- [`src/mcp/client.ts`](../packages/coding-agent/src/mcp/client.ts)\n- [`src/mcp/manager.ts`](../packages/coding-agent/src/mcp/manager.ts)\n\n## Layer boundaries\n\n### Protocol layer (JSON-RPC + MCP methods)\n\n- Message shapes are defined in `types.ts` (`JsonRpcRequest`, `JsonRpcNotification`, `JsonRpcResponse`, `JsonRpcMessage`).\n- MCP client logic (`client.ts`) decides method order and session handshake:\n 1. `initialize` request\n 2. for HTTP/SSE transports, start the optional background SSE listener after the initialize response has established any session id\n 3. `notifications/initialized` notification\n 4. method calls like `tools/list`, `tools/call`\n\n### Transport layer (`MCPTransport`)\n\n`MCPTransport` abstracts delivery and lifecycle:\n\n- `request(method, params, options?) -> Promise`\n- `notify(method, params?) -> Promise`\n- `close()`\n- `connected`\n- optional callbacks: `onClose`, `onError`, `onNotification`, `onRequest`\n\nTransport implementations own framing and I/O details:\n\n- `StdioTransport`: newline-delimited JSON over subprocess stdio\n- `HttpTransport`: JSON-RPC over HTTP POST, with optional SSE responses/listening\n\n### Manager/client wiring\n\n`connectToServer()` always installs an `onRequest` handler for standard server-to-client requests. `MCPManager` installs notification handlers, OAuth refresh hooks for HTTP OAuth servers, and `onClose` reconnect handling for managed connections.\n\n## Transport selection\n\n`client.ts:createTransport()` chooses transport from config:\n\n- `type` omitted or `\"stdio\"` -> `createStdioTransport`\n- `\"http\"` or `\"sse\"` -> `createHttpTransport`\n\n`\"sse\"` is treated as an HTTP transport variant (same class), not a separate transport implementation.\n\n## JSON-RPC message flow and correlation\n\n## Request IDs\n\nEach transport generates per-request IDs with `Snowflake.next()`. IDs are transport-local correlation tokens.\n\n## Stdio correlation path\n\n- Outbound request is serialized as one JSON object + `\\n`.\n- `#pendingRequests: Map` stores in-flight requests.\n- Read loop parses JSONL from stdout and calls `#handleMessage`.\n- If inbound message has matching `id`, request resolves/rejects.\n- If inbound message has `method` and no `id`, treated as notification and sent to `onNotification`.\n- If inbound message has both `method` and `id`, treated as a server-to-client request and answered through `onRequest`; without a handler the transport replies with JSON-RPC `-32601 Method not found`.\n\nUnknown response IDs are ignored (no rejection, no error callback).\n\n## HTTP correlation path\n\n- Outbound request is HTTP `POST` with JSON body and generated `id`.\n- Non-SSE response path: parse one JSON-RPC response and return `result`/throw on `error`.\n- SSE response path (`Content-Type: text/event-stream`): stream events, return first message whose `id` matches expected request ID and has `result` or `error`.\n- SSE messages with `method` and no `id` are treated as notifications.\n- SSE messages with both `method` and `id` are treated as server-to-client requests and answered with a POSTed JSON-RPC response.\n\nIf SSE stream ends before matching response, request fails with `No response received for request ID ...`. After the matching response is captured, the transport drains remaining SSE messages in the background.\n\n## Notifications\n\nClient emits JSON-RPC notifications via `transport.notify(...)`.\n\n- Stdio: writes notification frame to stdin (`jsonrpc`, `method`, optional `params`) plus newline.\n- HTTP: sends POST body without `id`; success accepts `2xx` or `202 Accepted`.\n\nServer-initiated notifications are surfaced through transport `onNotification`; `MCPManager` consumes known MCP list/update notifications and can forward all notifications through its own callback.\n\n## Stdio transport internals\n\n## Lifecycle and state transitions\n\n- Initial: `connected=false`, `process=null`, pending map empty\n- `connect()`:\n - spawn subprocess with configured command/args/env/cwd\n - mark connected\n - start stdout read loop (`readJsonl`)\n - start stderr loop (read/discard; currently silent)\n- `close()`:\n - mark disconnected\n - reject all pending requests (`Transport closed`)\n - kill subprocess\n - await read loop shutdown\n - emit `onClose`\n\nIf read loop exits unexpectedly, `finally` triggers `#handleClose()` which performs the same pending-request rejection and close callback.\n\n## Timeout and cancellation\n\nPer request:\n\n- timeout defaults to `config.timeout ?? 30000`\n- optional `AbortSignal` from caller\n- abort and timeout both reject the pending promise and clean map entry\n\nCancellation is local only: transport does not send protocol-level cancellation notification to the server.\n\n## Malformed payload handling\n\nIn read loop:\n\n- each parsed JSONL line is passed to `#handleMessage` in `try/catch`\n- malformed/invalid message handling exceptions are dropped (`Skip malformed lines` comment)\n- loop continues, so one bad message does not kill the connection\n\nIf the underlying stream parser throws, `onError` is invoked (when still connected), then connection closes.\n\n## Disconnect/failure behavior\n\nWhen process exits or stream closes:\n\n- all in-flight requests are rejected with `Transport closed`\n- no automatic restart or reconnect\n- higher layers must reconnect by creating a new transport\n\n## Backpressure/streaming notes\n\n- Outbound writes use `stdin.write()` + `flush()` without awaiting drain semantics.\n- There is no explicit queue or high-watermark management in transport.\n- Inbound processing is stream-driven (`for await` over `readJsonl`), one parsed message at a time.\n\n## HTTP/SSE transport internals\n\n## Lifecycle and connection semantics\n\nHTTP transport has logical connection state, but request path is stateless per HTTP call:\n\n- `connect()` sets `connected=true` (no socket/session handshake)\n- optional server session tracking via `Mcp-Session-Id` header\n- `close()` optionally sends `DELETE` with `Mcp-Session-Id`, aborts SSE listener, emits `onClose`\n\nSo `connected` means \"transport usable\", not \"persistent stream established\".\n\n## Session header behavior\n\n- On POST response, if `Mcp-Session-Id` header is present, transport stores it.\n- Subsequent requests/notifications include `Mcp-Session-Id`.\n- `close()` tries to terminate server session with HTTP DELETE; termination failures are ignored.\n\n## Timeout, cancellation, and auth refresh\n\nFor `request()`:\n\n- timeout uses `AbortController` (`config.timeout ?? 30000`)\n- external signal, if provided, is merged via `AbortSignal.any([...])`\n- AbortError handling distinguishes caller abort vs timeout\n\nFor `notify()`:\n\n- timeout uses an internal `AbortController` (`config.timeout ?? 30000`)\n- there is no external abort option on the transport interface\n\nFor HTTP OAuth configs managed by `MCPManager`, `request()` retries once on `HTTP 401`/`403` if token refresh returns replacement headers.\n\n## HTTP error propagation\n\nOn non-OK response:\n\n- response text is included in thrown error (`HTTP : `)\n- if present, auth hints from `WWW-Authenticate` and `Mcp-Auth-Server` are appended\n\nOn JSON-RPC error object:\n\n- throws `MCP error : `\n\nMalformed JSON body (`response.json()` failure) propagates as parse exception.\n\n## SSE behavior and modes\n\nTwo SSE paths exist:\n\n1. **Per-request SSE response** (`#parseSSEResponse`)\n - used when POST response content type is `text/event-stream`\n - consumes stream until matching response id found\n - can process interleaved notifications during same stream\n\n2. **Background SSE listener** (`startSSEListener()`)\n - optional GET listener for server-initiated notifications and server-to-client requests\n - `connectToServer()` starts it for HTTP/SSE transports after `initialize` and before `notifications/initialized`\n - if GET returns `405`, another non-OK status, or no body, listener silently disables itself\n\n## Malformed payload and disconnect handling\n\nSSE JSON parsing errors bubble out of `readSseJson` and reject request/listener.\n\n- Request SSE parse errors reject the active request.\n- Background listener errors trigger `onError` (except AbortError).\n- Transport does not restart the listener itself; managed connections may reconnect through manager `onClose` handling.\n\n## `json-rpc.ts` utility vs transport abstraction\n\n`src/mcp/json-rpc.ts` provides `callMCP()` and `parseSSE()` helpers for direct HTTP MCP calls (used by Exa integration), not the `MCPTransport` abstraction used by `MCPClient`/`MCPManager`.\n\nNotable differences from `HttpTransport`:\n\n- parses entire response text first, then extracts first `data: ` line (`parseSSE`), with JSON fallback\n- no request timeout management, no abort API, no session-id handling, no transport lifecycle\n- returns raw JSON-RPC envelope object\n\nThis path is lightweight but less robust than full transport implementation.\n\n## Retry/reconnect responsibilities\n\n## Transport-level\n\nCurrent transport implementations do **not**:\n\n- retry ordinary failed requests, except the HTTP transport's single OAuth-refresh retry when `onAuthError` is wired\n- reconnect after stdio process exit\n- reconnect SSE listeners by themselves\n- resend in-flight requests after disconnect\n\nThey fail fast and propagate errors.\n\n## Manager/tool-bridge level\n\n`MCPManager` wires `transport.onClose` for managed connections and runs `reconnectServer(name)` when a transport closes unexpectedly. Reconnect tears down the stale connection, re-resolves auth/config values, retries with backoff (`500`, `1000`, `2000`, `4000` ms), reloads tools, and preserves stale tools while reconnecting.\n\n`MCPTool` and `DeferredMCPTool` also attempt one reconnect + retry for retriable connection errors during a tool call. This is tool availability recovery, not transport-level retry.\n\n## Failure scenarios summary\n\n- **Malformed stdio message line**: dropped; stream continues.\n- **Stdio stream/process ends**: transport closes; pending requests rejected as `Transport closed`; manager-managed connections trigger reconnect.\n- **HTTP non-2xx**: request/notify throws HTTP error; managed OAuth requests can refresh auth and retry once on 401/403.\n- **Invalid JSON response**: parse exception propagated.\n- **SSE ends without matching id**: request fails with `No response received for request ID ...`.\n- **Timeout**: transport-specific timeout error.\n- **Caller abort**: AbortError/reason propagated from caller signal where the method accepts one.\n\n## Practical boundary rule\n\nIf the concern is message shape, id correlation, or MCP method ordering, it belongs to protocol/client logic.\n\nIf the concern is framing (JSONL vs HTTP/SSE), stream parsing, fetch/spawn lifecycle, timeout clocks, or connection teardown, it belongs to transport implementation.\n", "mcp-runtime-lifecycle.md": "# MCP runtime lifecycle\n\nThis document describes how MCP servers are discovered, connected, exposed as tools, refreshed, and torn down in the coding-agent runtime.\n\n## Lifecycle at a glance\n\n1. **SDK startup** calls `discoverAndLoadMCPTools()` (unless MCP is disabled).\n2. **Discovery** (`loadAllMCPConfigs`) resolves MCP server configs from capability sources, filters disabled/project/Exa entries, and preserves source metadata.\n3. **Manager connect phase** (`MCPManager.connectServers`) starts per-server connect + `tools/list` in parallel.\n4. **Fast startup gate** waits up to 250ms, then may return:\n - fully loaded `MCPTool`s,\n - failures per server,\n - or cached `DeferredMCPTool`s for still-pending servers.\n5. **SDK wiring** merges MCP tools into runtime tool registry for the session.\n6. **Post-connect enrichment** best-effort loads resources, resource templates, prompts, and optional resource subscriptions.\n7. **Live session** can refresh MCP tools via `/mcp` flows (`disconnectAll` + rediscover + `session.refreshMCPTools`) and can reconnect individual servers on transport close or `/mcp reconnect`.\n8. **Teardown** happens when callers invoke `disconnectServer`/`disconnectAll`; manager also clears MCP tool/resource/prompt registrations for disconnected servers.\n\n## Discovery and load phase\n\n### Entry path from SDK\n\n`createAgentSession()` in `src/sdk.ts` performs MCP startup when `enableMCP` is true (default):\n\n- calls `discoverAndLoadMCPTools(cwd, { ... })`,\n- passes `authStorage`, cache storage, and `mcp.enableProjectConfig` setting,\n- always sets `filterExa: true`,\n- logs per-server load/connect errors,\n- stores returned manager in `toolSession.mcpManager` and session result.\n\nIf `enableMCP` is false, MCP discovery is skipped entirely.\n\n### Config discovery and filtering\n\n`loadAllMCPConfigs()` (`src/mcp/config.ts`) loads canonical MCP server items through capability discovery, then converts to legacy `MCPServerConfig`.\n\nFiltering behavior:\n\n- `enableProjectConfig: false` removes project-level entries (`_source.level === \"project\"`).\n- `enabled: false` servers are skipped before connect attempts.\n- Exa servers are filtered out by default and API keys are extracted for native Exa tool integration.\n\nResult includes both `configs` and `sources` (metadata used later for provider labeling).\n\n### Discovery-level failure behavior\n\n`discoverAndLoadMCPTools()` distinguishes two failure classes:\n\n- **Discovery hard failure** (exception from `manager.discoverAndConnect`, typically from config discovery): returns an empty tool set and one synthetic error `{ path: \".mcp.json\", error }`.\n- **Per-server runtime/connect failure**: manager returns partial success with `errors` map; other servers continue.\n\nSo startup does not fail the whole agent session when individual MCP servers fail.\n\n## Manager state model\n\n`MCPManager` tracks runtime lifecycle with separate registries:\n\n- `#connections: Map` — fully connected servers.\n- `#pendingConnections: Map>` — handshake in progress.\n- `#pendingToolLoads: Map>` — connected but tools still loading.\n- `#tools: CustomTool[]` — current MCP tool view exposed to callers.\n- `#sources: Map` — provider/source metadata even before connect completes.\n- `#pendingReconnections: Map>` — reconnects in progress after a dropped transport or explicit reconnect.\n- `#serverConfigs: Map` — original unresolved configs preserved so reconnect can re-resolve credentials without leaking resolved tokens.\n\n`getConnectionStatus(name)` derives status from these maps:\n\n- `connected` if in `#connections`,\n- `connecting` if pending connect, pending tool load, or pending reconnect,\n- `disconnected` otherwise.\n\n## Connection establishment and startup timing\n\n## Per-server connect pipeline\n\nFor each discovered server in `connectServers()`:\n\n1. store/update source metadata,\n2. skip if already connected/pending/reconnecting,\n3. validate transport fields (`validateServerConfig`),\n4. resolve auth/shell substitutions (`#resolveAuthConfig`),\n5. call `connectToServer(name, resolvedConfig)` with manager notification/request handlers,\n6. wire HTTP OAuth refresh and transport `onClose` reconnect handling,\n7. call `listTools(connection)`,\n8. cache tool definitions (`MCPToolCache.set`) best-effort,\n9. best-effort load resources, resource templates, prompts, and subscriptions after tools load.\n\n`connectToServer()` behavior (`src/mcp/client.ts`):\n\n- creates stdio or HTTP/SSE transport,\n- performs MCP `initialize`,\n- for HTTP/SSE, starts the optional background SSE listener before `notifications/initialized`,\n- sends `notifications/initialized`,\n- uses timeout (`config.timeout` or 30s default),\n- closes transport on init failure.\n\n### Fast startup gate + deferred fallback\n\n`connectServers()` waits on a race between:\n\n- all connect/tool-load tasks settled, and\n- `STARTUP_TIMEOUT_MS = 250`.\n\nAfter 250ms:\n\n- fulfilled tasks become live `MCPTool`s,\n- rejected tasks produce per-server errors,\n- still-pending tasks:\n - use cached tool definitions if available (`MCPToolCache.get`) to create `DeferredMCPTool`s,\n - otherwise block until those pending tasks settle.\n\nThis is a hybrid startup model: fast return when cache is available, correctness wait when cache is not.\n\n### Background completion behavior\n\nEach pending `toolsPromise` also has a background continuation that eventually:\n\n- replaces that server’s tool slice in manager state via `#replaceServerTools`,\n- writes cache,\n- logs late failures only after startup (`allowBackgroundLogging`).\n\n## Tool exposure and live-session availability\n\n### Startup registration\n\n`discoverAndLoadMCPTools()` converts manager tools into `LoadedCustomTool[]` and decorates paths (`mcp: via ` when known).\n\n`createAgentSession()` then pushes these tools into `customTools`, which are wrapped and added to the runtime tool registry with names like `mcp___`.\n\n### Tool calls\n\n- `MCPTool` calls tools through an already connected `MCPServerConnection`.\n- `DeferredMCPTool` waits for `waitForConnection(server)` before calling; this allows cached tools to exist before connection is ready.\n- Both attempt a reconnect + single retry for retriable connection failures.\n\nBoth return structured tool output and convert remaining transport/tool errors into `MCP error: ...` tool content (abort remains abort).\n\n## Refresh/reload paths (startup vs live reload)\n\n### Initial startup path\n\n- one-time discovery/load in `sdk.ts`,\n- tools are registered in initial session tool registry.\n\n### Interactive reload path\n\n`/mcp reload` path (`src/modes/controllers/mcp-command-controller.ts`) does:\n\n1. `mcpManager.disconnectAll()`,\n2. `mcpManager.discoverAndConnect()`,\n3. `session.refreshMCPTools(mcpManager.getTools())`.\n\n`session.refreshMCPTools()` (`src/session/agent-session.ts`) removes all `mcp__` tools, re-wraps latest MCP tools, and re-activates tool set so MCP changes apply without restarting session.\n\nThere is also a follow-up path for late connections: after waiting for a specific server, if status becomes `connected`, it re-runs `session.refreshMCPTools(...)` so newly available tools are rebound in-session.\n\n## Health, reconnect, and partial failure behavior\n\nCurrent runtime behavior is connection-event driven:\n\n- **No autonomous polling health monitor** in manager/client.\n- **Automatic reconnect is wired to `transport.onClose`** for managed connections.\n- Reconnect retries with backoff (`500`, `1000`, `2000`, `4000` ms), reloads tools, and notifies consumers on success.\n- Tool calls that see retriable connection errors also attempt one reconnect + retry.\n- Reconnect is also explicit via `/mcp reconnect ` or broader `/mcp reload`.\n\nOperationally:\n\n- one server failing does not remove tools from healthy servers,\n- connect/list failures are isolated per server,\n- stale tools may remain visible while reconnect is attempted; calls report MCP errors if recovery fails,\n- tool cache, resource/prompt loading, subscriptions, and background updates are best-effort (warnings/errors logged, no hard stop).\n\n## Teardown semantics\n\n### Server-level teardown\n\n`disconnectServer(name)`:\n\n- removes pending entries, source metadata, saved config, resource refresh/subscription state,\n- detaches `onClose` so explicit close does not trigger reconnect,\n- closes transport if connected,\n- filters manager tool state for names beginning with `mcp__${name}_`.\n\n### Global teardown\n\n`disconnectAll()`:\n\n- detaches `onClose` for all active transports, then closes them with `Promise.allSettled`,\n- clears pending maps, sources, saved configs, connections, subscriptions, resource refreshes, and manager tool list.\n\nIn current wiring, explicit teardown is used in MCP command flows (for reload/remove/disable). Startup stores the manager on the session; callers that need deterministic MCP shutdown should invoke manager disconnect methods.\n\n## Failure modes and guarantees\n\n| Scenario | Behavior | Hard fail vs best-effort |\n| ---------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------- | ------------------------------ |\n| Discovery throws (capability/config load path) | Loader returns empty tools + synthetic `.mcp.json` error | Best-effort session startup |\n| Invalid server config | Server skipped with validation error entry | Best-effort per server |\n| Connect timeout/init failure | Server error recorded; others continue | Best-effort per server |\n| `tools/list` still pending at startup with cache hit | Deferred tools returned immediately | Best-effort fast startup |\n| `tools/list` still pending at startup without cache | Startup waits for pending to settle | Hard wait for correctness |\n| Late background tool-load failure | Logged after startup gate | Best-effort logging |\n| Runtime dropped transport | Manager attempts reconnect; stale tools remain while reconnecting and future calls may retry once or fail with MCP errors | Best-effort automatic recovery |\n\n## Public API surface\n\n`src/mcp/index.ts` re-exports loader/manager/client APIs for external callers. `src/sdk.ts` exposes `discoverMCPServers()` as a convenience wrapper returning the same loader result shape.\n\n## Implementation files\n\n- [`src/mcp/loader.ts`](../packages/coding-agent/src/mcp/loader.ts) — loader facade, discovery error normalization, `LoadedCustomTool` conversion.\n- [`src/mcp/manager.ts`](../packages/coding-agent/src/mcp/manager.ts) — lifecycle state registries, parallel connect/list flow, refresh/disconnect.\n- [`src/mcp/client.ts`](../packages/coding-agent/src/mcp/client.ts) — transport setup, initialize handshake, list/call/disconnect.\n- [`src/mcp/index.ts`](../packages/coding-agent/src/mcp/index.ts) — MCP module API exports.\n- [`src/sdk.ts`](../packages/coding-agent/src/sdk.ts) — startup wiring into session/tool registry.\n- [`src/mcp/config.ts`](../packages/coding-agent/src/mcp/config.ts) — config discovery/filtering/validation used by manager.\n- [`src/mcp/tool-bridge.ts`](../packages/coding-agent/src/mcp/tool-bridge.ts) — `MCPTool` and `DeferredMCPTool` runtime behavior.\n- [`src/session/agent-session.ts`](../packages/coding-agent/src/session/agent-session.ts) — `refreshMCPTools` live rebinding.\n- [`src/modes/controllers/mcp-command-controller.ts`](../packages/coding-agent/src/modes/controllers/mcp-command-controller.ts) — interactive reload/reconnect flows.\n- [`src/task/executor.ts`](../packages/coding-agent/src/task/executor.ts) — subagent MCP proxying via parent manager connections.\n", "mcp-server-tool-authoring.md": "# MCP server and tool authoring\n\nThis document explains how MCP server definitions become callable `mcp__*` tools in coding-agent, and what operators should expect when configs are invalid, duplicated, disabled, or auth-gated.\n\n## Architecture at a glance\n\n```text\nConfig sources (.omp/.claude/.cursor/.vscode/mcp.json, mcp.json, etc.)\n -> discovery providers normalize to canonical MCPServer\n -> capability loader dedupes by server name (higher provider priority wins)\n -> loadAllMCPConfigs converts to MCPServerConfig + skips enabled:false\n -> MCPManager connects/listTools (with auth/header/env resolution)\n -> manager best-effort loads resources/prompts and subscribes to resource updates when enabled\n -> MCPTool/DeferredMCPTool bridge exposes tools as mcp___\n -> AgentSession.refreshMCPTools replaces live MCP tools immediately\n```\n\n## 1) Server config model and validation\n\n`src/mcp/types.ts` defines the authoring shape used by MCP config writers and runtime:\n\n- `stdio` (default when `type` missing): requires `command`, optional `args`, `env`, `cwd`\n- `http`: requires `url`, optional `headers`\n- `sse`: requires `url`, optional `headers` (kept for compatibility)\n- shared fields: `enabled`, `timeout`, `auth`, `oauth`\n\n`validateServerConfig()` (`src/mcp/config.ts`) enforces transport basics:\n\n- rejects configs that set both `command` and `url`\n- requires `command` for stdio\n- requires `url` for http/sse\n- rejects unknown `type`\n\n`config-writer.ts` applies this validation for add/update operations and also validates server names:\n\n- non-empty\n- max 100 chars\n- only `[a-zA-Z0-9_.-]`\n\n### Transport pitfalls\n\n- `type` omitted means stdio. If you intended HTTP/SSE but omitted `type`, `command` becomes mandatory.\n- `sse` is still accepted but treated as HTTP transport internally (`createHttpTransport`).\n- Validation is structural, not reachability: a syntactically valid URL can still fail at connect time.\n\n## 2) Discovery, normalization, and precedence\n\n### Capability-based discovery\n\n`loadAllMCPConfigs()` (`src/mcp/config.ts`) loads canonical `MCPServer` items via `loadCapability(mcpCapability.id)`.\n\nThe capability layer (`src/capability/index.ts`) then:\n\n1. loads providers in priority order\n2. dedupes by `server.name` (first win = highest priority)\n3. validates deduped items\n\nResult: duplicate server names across sources are not merged. One definition wins; lower-priority duplicates are shadowed.\n\n### `.mcp.json` and related files\n\nThe dedicated fallback provider in `src/discovery/mcp-json.ts` reads project-root `mcp.json` and `.mcp.json` (low priority).\n\nIn practice MCP servers also come from higher-priority providers (for example native `.omp/...` and tool-specific config dirs). Authoring guidance:\n\n- Prefer `.omp/mcp.json` (project) or `~/.omp/agent/mcp.json` (user) for explicit control.\n- Use root `mcp.json` / `.mcp.json` when you need fallback compatibility.\n- Reusing the same server name in multiple sources causes precedence shadowing, not merge.\n\n### Normalization behavior\n\n`convertToLegacyConfig()` (`src/mcp/config.ts`) maps canonical `MCPServer` to runtime `MCPServerConfig`.\n\nKey behavior:\n\n- transport inferred as `server.transport ?? (command ? \"stdio\" : url ? \"http\" : \"stdio\")`\n- disabled servers (`enabled === false`) are dropped before connection\n- optional fields are preserved when present\n\n### Environment expansion during discovery\n\n`mcp-json.ts` expands env placeholders in string fields with `expandEnvVarsDeep()`:\n\n- supports `${VAR}` and `${VAR:-default}`\n- unresolved values remain literal `${VAR}` strings\n\n`mcp-json.ts` also performs runtime type checks for user JSON and logs warnings for invalid `enabled`/`timeout` values instead of hard-failing the whole file.\n\n## 3) Auth and runtime value resolution\n\n`MCPManager.prepareConfig()`/`#resolveAuthConfig()` (`src/mcp/manager.ts`) is the final pre-connect pass.\n\n### OAuth credential injection\n\nIf config has:\n\n```ts\nauth: { type: \"oauth\", credentialId: \"...\" }\n```\n\nand credential exists in auth storage:\n\n- `http`/`sse`: injects `Authorization: Bearer ` header\n- `stdio`: injects `OAUTH_ACCESS_TOKEN` env var\n\nIf credential lookup fails, manager logs a warning and continues with unresolved auth.\n\n### Header/env value resolution\n\nBefore connect, manager resolves stdio `env` values and HTTP/SSE `headers` values via `resolveConfigValue()` (`src/config/resolve-config-value.ts`):\n\n- value starting with `!` => execute shell command, use trimmed stdout (cached)\n- failed, timed-out, or whitespace-only commands produce `undefined`, so that entry is omitted\n- otherwise, treat value as environment variable name first (`process.env[name]`), fallback to literal value\n\nOperational caveat: a mistyped `!` secret command can silently remove that header/env entry, producing downstream 401/403 or server startup failures. A mistyped environment variable name is sent literally unless that literal happens to be meaningful to the server.\n\n## 4) Tool bridge: MCP -> agent-callable tools\n\n`src/mcp/tool-bridge.ts` converts MCP tool definitions into `CustomTool`s.\n\n### Naming and collision domain\n\nTool names are generated as:\n\n```text\nmcp___\n```\n\nRules:\n\n- lowercases\n- non-`[a-z_]` chars become `_`\n- repeated underscores collapse\n- redundant `_` prefix in tool name is stripped once\n\nThis avoids many collisions, but not all. Different raw names can still sanitize to the same identifier (for example `my-server` and `my.server` both sanitize similarly), and registry insertion is last-write-wins.\n\n### Schema mapping\n\n`tool-bridge.ts` passes each MCP `inputSchema` through `sanitizeSchemaForMCP()` before registering it as a `CustomTool` schema.\n\n### Execution mapping\n\n`MCPTool.execute()` / `DeferredMCPTool.execute()`:\n\n- calls MCP `tools/call`\n- flattens MCP content into displayable text\n- returns structured details (`serverName`, `mcpToolName`, provider metadata)\n- maps server-reported `isError` to `Error: ...` text result\n- attempts reconnect + one retry for retriable connection errors\n- maps remaining thrown transport/runtime failures to `MCP error: ...`\n- preserves abort semantics by translating AbortError into `ToolAbortError`\n\n## 5) Operator lifecycle: add/edit/remove and live updates\n\nInteractive mode exposes `/mcp` in `src/modes/controllers/mcp-command-controller.ts`.\n\nSupported operations:\n\n- `add` (wizard or quick-add)\n- `remove` / `rm`\n- `enable` / `disable`\n- `test`\n- `reauth` / `unauth`\n- `reconnect`\n- `reload`\n- `resources`, `prompts`, `notifications`\n- Smithery search/login/logout flows\n\nConfig writes are atomic (`writeMCPConfigFile`: temp file + rename).\n\nAfter changes, controller calls `#reloadMCP()`:\n\n1. `mcpManager.disconnectAll()`\n2. `mcpManager.discoverAndConnect()`\n3. `session.refreshMCPTools(mcpManager.getTools())`\n\n`refreshMCPTools()` replaces all `mcp__` registry entries and immediately re-activates the latest MCP tool set, so changes take effect without restarting the session.\n\n### Mode differences\n\n- **Interactive/TUI mode**: `/mcp` gives in-app UX (wizard, OAuth flow, connection status text, immediate runtime rebinding).\n- **SDK/headless integration**: `discoverAndLoadMCPTools()` (`src/mcp/loader.ts`) returns loaded tools + per-server errors; no `/mcp` command UX.\n\n## 6) User-visible error surfaces\n\nCommon error strings users/operators see:\n\n- add/update validation failures:\n - `Invalid server config: ...`\n - `Server \"\" already exists in `\n- quick-add argument issues:\n - `Use either --url or -- , not both.`\n - `--token requires --url (HTTP/SSE transport).`\n- connect/test failures:\n - `Failed to connect to \"\": `\n - timeout help text suggests increasing timeout\n - auth help text for `401/403`\n- auth/OAuth flows:\n - `Authentication required ... OAuth endpoints could not be discovered`\n - `OAuth flow timed out. Please try again.`\n - `OAuth authentication failed: ...`\n- disabled server usage:\n - `Server \"\" is disabled. Run /mcp enable first.`\n\nBad source JSON in discovery is generally handled as warnings/logs; config-writer paths throw explicit errors.\n\n## 7) Practical authoring guidance\n\nFor robust MCP authoring in this codebase:\n\n1. Keep server names globally unique across all MCP-capable config sources.\n2. Prefer names that remain distinct after MCP tool-name sanitization to avoid generated `mcp__` collisions.\n3. Use explicit `type` to avoid accidental stdio defaults.\n4. Treat `enabled: false` as hard-off: server is omitted from runtime connect set.\n5. For OAuth configs, store a valid `credentialId`; otherwise auth injection is skipped.\n6. If using command-based secret resolution (`!cmd`), verify command output is stable and non-empty.\n\n## Implementation files\n\n- [`src/mcp/types.ts`](../packages/coding-agent/src/mcp/types.ts)\n- [`src/mcp/config.ts`](../packages/coding-agent/src/mcp/config.ts)\n- [`src/mcp/config-writer.ts`](../packages/coding-agent/src/mcp/config-writer.ts)\n- [`src/mcp/tool-bridge.ts`](../packages/coding-agent/src/mcp/tool-bridge.ts)\n- [`src/discovery/mcp-json.ts`](../packages/coding-agent/src/discovery/mcp-json.ts)\n- [`src/modes/controllers/mcp-command-controller.ts`](../packages/coding-agent/src/modes/controllers/mcp-command-controller.ts)\n- [`src/mcp/manager.ts`](../packages/coding-agent/src/mcp/manager.ts)\n- [`src/capability/index.ts`](../packages/coding-agent/src/capability/index.ts)\n- [`src/config/resolve-config-value.ts`](../packages/coding-agent/src/config/resolve-config-value.ts)\n- [`src/mcp/loader.ts`](../packages/coding-agent/src/mcp/loader.ts)\n", "memory.md": "# Autonomous Memory\n\nWhen enabled, the agent automatically extracts durable knowledge from past sessions and injects a compact summary into each new session. Over time it builds a project-scoped memory store — technical decisions, recurring workflows, pitfalls — that carries forward without manual effort.\n\nDisabled by default. Enable via `/settings` or `config.yml`:\n\n```yaml\nmemories:\n enabled: true\n```\n\n## Usage\n\n### What gets injected\n\nAt session start, if a memory summary exists for the current project, it is injected into the system prompt as a **Memory Guidance** block. The agent is instructed to:\n\n- Treat memory as heuristic context — useful for process and prior decisions, not authoritative on current repo state.\n- Cite the memory artifact path when memory changes the plan, and pair it with current-repo evidence before acting.\n- Prefer repo state and user instruction when they conflict with memory; treat conflicting memory as stale.\n\n### Reading memory artifacts\n\nThe agent can read memory files directly using `memory://` URLs with the `read` tool:\n\n| URL | Content |\n| -------------------------------------- | ----------------------------------- |\n| `memory://root` | Compact summary injected at startup |\n| `memory://root/MEMORY.md` | Full long-term memory document |\n| `memory://root/skills//SKILL.md` | A generated skill playbook |\n\n### `/memory` slash command\n\n| Subcommand | Effect |\n| --------------------- | ---------------------------------------------- |\n| `view` | Show the current memory injection payload |\n| `clear` / `reset` | Delete all memory data and generated artifacts |\n| `enqueue` / `rebuild` | Force consolidation to run at next startup |\n\n## How it works\n\nMemories are built by a background pipeline that runs at startup or when manually triggered via slash command.\n\n**Phase 1 — per-session extraction:** For each past session that has changed since it was last processed, a model reads the session history and extracts durable signal: technical decisions, constraints, resolved failures, recurring workflows. Sessions that are too recent, too old, or currently active are skipped. Each extraction produces a raw memory block and a short synopsis for that session.\n\n**Phase 2 — consolidation:** After extraction, a second model pass reads all per-session extractions and produces three outputs written to disk:\n\n- `MEMORY.md` — a curated long-term memory document\n- `memory_summary.md` — the compact text injected at session start\n- `skills/` — reusable procedural playbooks, each in its own subdirectory\n\nPhase 2 uses a lease to prevent double-running when multiple processes start simultaneously. Stale skill directories from prior runs are pruned automatically.\n\nAll output is scanned for secrets before being written to disk.\n\n### Extraction behavior\n\nMemory extraction and consolidation behavior is driven by static prompt files in `packages/coding-agent/src/prompts/memories/`.\n\n| File | Purpose | Variables |\n| --------------------- | ------------------------------------------- | ------------------------------------------- |\n| `stage_one_system.md` | System prompt for per-session extraction | — |\n| `stage_one_input.md` | User-turn template wrapping session content | `{{thread_id}}`, `{{response_items_json}}` |\n| `consolidation.md` | Prompt for cross-session consolidation | `{{raw_memories}}`, `{{rollout_summaries}}` |\n| `read_path.md` | Memory guidance injected into live sessions | `{{memory_summary}}` |\n\n### Model selection\n\nMemory piggybacks on the model role system.\n\n| Phase | Role | Purpose |\n| ----------------------- | ------------------------------------------------------------------- | -------------------------------- |\n| Phase 1 (extraction) | `default` | Per-session knowledge extraction |\n| Phase 2 (consolidation) | `smol` (falls back to `default`, then current/first registry model) | Cross-session synthesis |\n\nIf the requested memory role is not configured, memory model resolution falls back to the `default` role, then the active session model, then the first model in the registry.\n\n## Configuration\n\n| Setting | Default | Description |\n| ------------------------------------- | ------- | --------------------------------------------------------- |\n| `memories.enabled` | `false` | Master switch |\n| `memories.maxRolloutAgeDays` | `30` | Sessions older than this are not processed |\n| `memories.minRolloutIdleHours` | `12` | Sessions active more recently than this are skipped |\n| `memories.maxRolloutsPerStartup` | `64` | Cap on sessions processed in a single startup |\n| `memories.summaryInjectionTokenLimit` | `5000` | Max tokens of the summary injected into the system prompt |\n\nAdditional tuning knobs (concurrency, lease durations, token budgets) are available in config for advanced use.\n\n## Key files\n\n- `packages/coding-agent/src/memories/index.ts` — pipeline orchestration, injection, slash command handling\n- `packages/coding-agent/src/memories/storage.ts` — SQLite-backed job queue and thread registry\n- `packages/coding-agent/src/prompts/memories/` — memory prompt templates\n- `packages/coding-agent/src/internal-urls/memory-protocol.ts` — `memory://` URL handler\n", "models.md": "# Model and Provider Configuration (`models.yml`)\n\nThis document describes how the coding-agent currently loads models, applies overrides, resolves credentials, and chooses models at runtime.\n\n## What controls model behavior\n\nPrimary implementation files:\n\n- `src/config/model-registry.ts` — loads built-in + custom models, provider overrides, runtime discovery, auth integration\n- `src/config/model-resolver.ts` — parses model patterns and selects initial/smol/slow models\n- `src/config/settings-schema.ts` — model-related settings (`modelRoles`, provider transport preferences)\n- `src/session/auth-storage.ts` — API key + OAuth resolution order\n- `packages/ai/src/models.ts` and `packages/ai/src/types.ts` — built-in providers/models and `Model`/`compat` types\n\n## Config file location and legacy behavior\n\nDefault config path:\n\n- `~/.omp/agent/models.yml`\n\nLegacy behavior still present:\n\n- If `models.yml` is missing and `models.json` exists at the same location, it is migrated to `models.yml`.\n- Explicit `.json` / `.jsonc` config paths are still supported when passed programmatically to `ModelRegistry`.\n\n## `models.yml` shape\n\n```yaml\nproviders:\n :\n # provider-level config\nequivalence:\n overrides:\n /: \n exclude:\n - /\n```\n\n`provider-id` is the canonical provider key used across selection and auth lookup.\n\n`equivalence` is optional and configures canonical model grouping on top of concrete provider models:\n\n- `overrides` maps an exact concrete selector (`provider/modelId`) to an official upstream canonical id\n- `exclude` opts a concrete selector out of canonical grouping\n\n## Provider-level fields\n\n```yaml\nproviders:\n my-provider:\n baseUrl: https://api.example.com/v1\n apiKey: MY_PROVIDER_API_KEY\n api: openai-completions\n headers:\n X-Team: platform\n authHeader: true\n auth: apiKey\n disableStrictTools: false # set true for Anthropic-compatible endpoints that reject the strict field\n discovery:\n type: ollama\n modelOverrides:\n some-model-id:\n name: Renamed model\n models:\n - id: some-model-id\n name: Some Model\n api: openai-completions\n reasoning: false\n input: [text]\n cost:\n input: 0\n output: 0\n cacheRead: 0\n cacheWrite: 0\n contextWindow: 128000\n maxTokens: 16384\n headers:\n X-Model: value\n compat:\n supportsStore: true\n supportsDeveloperRole: true\n supportsReasoningEffort: true\n maxTokensField: max_completion_tokens\n openRouterRouting:\n only: [anthropic]\n vercelGatewayRouting:\n order: [anthropic, openai]\n extraBody:\n gateway: m1-01\n controller: mlx\n```\n\n### Allowed provider/model `api` values\n\n- `openai-completions`\n- `openai-responses`\n- `openai-codex-responses`\n- `azure-openai-responses`\n- `anthropic-messages`\n- `google-generative-ai`\n- `google-vertex`\n\n### Allowed auth/discovery values\n\n- `auth`: `apiKey` (default), `none`, or `oauth`; for `models.yml` custom models, `oauth` is accepted by schema but does not waive the `apiKey` requirement\n- `discovery.type`: `ollama`, `llama.cpp`, or `lm-studio`\n\n## Validation rules (current)\n\n### Full custom provider (`models` is non-empty)\n\nRequired:\n\n- `baseUrl`\n- `apiKey` unless `auth: none`\n- `api` at provider level or each model\n\n### Override-only provider (`models` missing or empty)\n\nMust define at least one of:\n\n- `baseUrl`\n- `headers`\n- `compat`\n- `disableStrictTools`\n- `modelOverrides`\n- `discovery`\n\n### Discovery\n\n- `discovery` requires provider-level `api`.\n\n### Model value checks\n\n- `id` required\n- `contextWindow` and `maxTokens` must be positive if provided\n\n## Merge and override order\n\nModelRegistry pipeline (on refresh):\n\n1. Load built-in providers/models from `@oh-my-pi/pi-ai`.\n2. Load `models.yml` custom config.\n3. Apply provider overrides (`baseUrl`, `headers`, `disableStrictTools`) to built-in models.\n4. Apply `modelOverrides` (per provider + model id).\n5. Merge custom `models`:\n - same `provider + id` replaces existing\n - otherwise append\n6. Load cached/runtime-discovered models (Ollama, llama.cpp, LM Studio, plus built-in provider managers), then re-apply model overrides.\n\n### Provider-model cache and static fingerprint\n\nCached per-provider model lists are persisted in the model-cache SQLite\ndatabase (schema v3) with a `static_fingerprint` column that hashes the\nstatic catalog slice merged into the row. When `resolveProviderModels`\nskips the network fetch and the fingerprint of the in-memory static\ncatalog matches the cached one, the cached rows are returned verbatim —\nthe static + dynamic merge is bypassed entirely. The fingerprint is\nmemoized per process via a WeakMap keyed by the static-models array\nreference, so repeated cold-start calls do not re-hash.\n\n## Canonical model equivalence and coalescing\n\nThe registry keeps every concrete provider model and then builds a canonical layer above them.\n\nCanonical ids are official upstream ids only, for example:\n\n- `claude-opus-4-6`\n- `claude-haiku-4-5`\n- `gpt-5.3-codex`\n\n### `models.yml` equivalence config\n\nExample:\n\n```yaml\nproviders:\n zenmux:\n baseUrl: https://api.zenmux.example/v1\n apiKey: ZENMUX_API_KEY\n api: openai-codex-responses\n models:\n - id: codex\n name: Zenmux Codex\n reasoning: true\n input: [text]\n cost:\n input: 0\n output: 0\n cacheRead: 0\n cacheWrite: 0\n contextWindow: 200000\n maxTokens: 32768\n\nequivalence:\n overrides:\n zenmux/codex: gpt-5.3-codex\n p-codex/codex: gpt-5.3-codex\n exclude:\n - demo/codex-preview\n```\n\nBuild order for canonical grouping:\n\n1. exact user override from `equivalence.overrides`\n2. bundled official-id matches from built-in model metadata\n3. conservative heuristic normalization for gateway/provider variants\n4. fallback to the concrete model's own id\n\nCurrent heuristics are intentionally narrow:\n\n- embedded upstream prefixes can be stripped when present, for example `anthropic/...` or `openai/...`\n- dotted and dashed version variants can normalize only when they map to an existing official id, for example `4.6 -> 4-6`\n- ambiguous families or versions are not merged without a bundled match or explicit override\n\n### Canonical resolution behavior\n\nWhen multiple concrete variants share a canonical id, resolution uses:\n\n1. availability and auth\n2. `config.yml` `modelProviderOrder`\n3. existing registry/provider order if `modelProviderOrder` is unset\n\nDisabled or unauthenticated providers are skipped.\n\nSession state and transcripts continue to record the concrete provider/model that actually executed the turn.\n\nProvider defaults vs per-model overrides:\n\n- Provider `headers` are baseline.\n- Model `headers` override provider header keys.\n- `modelOverrides` can override model metadata (`name`, `reasoning`, `input`, `cost`, `contextWindow`, `maxTokens`, `headers`, `compat`, `contextPromotionTarget`).\n- `compat` is deep-merged for nested routing blocks (`openRouterRouting`, `vercelGatewayRouting`, `extraBody`).\n\n## Runtime discovery integration\n\n### Implicit Ollama discovery\n\nIf `ollama` is not explicitly configured, registry adds an implicit discoverable provider:\n\n- provider: `ollama`\n- api: `openai-responses`\n- base URL: `OLLAMA_BASE_URL` or `http://127.0.0.1:11434`\n- auth mode: keyless (`auth: none` behavior)\n\nRuntime discovery calls Ollama endpoints and normalizes discovered OpenAI-compatible models to `openai-responses`.\n\n### Implicit llama.cpp discovery\n\nIf `llama.cpp` is not explicitly configured, registry adds an implicit discoverable provider:\n\n- provider: `llama.cpp`\n- api: `openai-responses`\n- base URL: `LLAMA_CPP_BASE_URL` or `http://127.0.0.1:8080`\n- auth mode: keyless (`auth: none` behavior)\n\nRuntime discovery calls llama.cpp model endpoints and synthesizes model entries with local defaults.\n\n### Implicit LM Studio discovery\n\nIf `lm-studio` is not explicitly configured, registry adds an implicit discoverable provider:\n\n- provider: `lm-studio`\n- api: `openai-completions`\n- base URL: `LM_STUDIO_BASE_URL` or `http://127.0.0.1:1234/v1`\n- auth mode: keyless (`auth: none` behavior)\n\nRuntime discovery fetches models (`GET /models`) and synthesizes model entries with local defaults.\n\n### Explicit provider discovery\n\nYou can configure discovery yourself:\n\n```yaml\nproviders:\n ollama:\n baseUrl: http://127.0.0.1:11434\n api: openai-responses\n auth: none\n discovery:\n type: ollama\n\n llama.cpp:\n baseUrl: http://127.0.0.1:8080\n api: openai-responses\n auth: none\n discovery:\n type: llama.cpp\n```\n\n### Extension provider registration\n\nExtensions can register providers at runtime (`pi.registerProvider(...)`), including:\n\n- model replacement/append for a provider\n- custom stream handler registration for new API IDs\n- custom OAuth provider registration\n\n## Auth and API key resolution order\n\nWhen requesting a key for a provider, effective order is:\n\n1. Runtime override (CLI `--api-key`)\n2. Stored API key credential in `agent.db`\n3. Stored OAuth credential in `agent.db` (with refresh)\n4. Environment variable mapping (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, etc.)\n5. ModelRegistry fallback resolver (provider `apiKey` from `models.yml`, env-name-or-literal semantics)\n\n`models.yml` `apiKey` behavior:\n\n- Value is first treated as an environment variable name.\n- If no env var exists, the literal string is used as the token.\n\nIf `authHeader: true` and provider `apiKey` is set, models get:\n\n- `Authorization: Bearer ` header injected.\n\nKeyless providers:\n\n- Providers marked `auth: none` are treated as available without credentials.\n- `getApiKey*` returns `kNoAuth` for them.\n\n### Broker mode\n\nWhen `OMP_AUTH_BROKER_URL` (or `auth.broker.url`) is set, the local SQLite credential store is replaced by `RemoteAuthCredentialStore`. Layers 2 and 3 above (stored API key / OAuth in `agent.db`) are served from a broker-supplied snapshot whose `refresh` tokens are redacted; expiry triggers `POST /v1/credential/:id/refresh` on the broker rather than a local refresh.\n\n`AuthStorage.setConfigApiKey` lets a `models.yml` `apiKey` win over a broker-resolved OAuth token without overriding a runtime `--api-key`. See [`auth-broker-gateway.md`](./auth-broker-gateway.md) for the full broker / gateway design and env surface (`OMP_AUTH_BROKER_URL`, `OMP_AUTH_BROKER_TOKEN`, `auth.broker.url`, `auth.broker.token`).\n\n## Model availability vs all models\n\n- `getAll()` returns the loaded model registry (built-in + merged custom + discovered).\n- `getAvailable()` filters to models that are keyless or have resolvable auth.\n\nSo a model can exist in registry but not be selectable until auth is available.\n\n## Runtime model resolution\n\n### CLI and pattern parsing\n\n`model-resolver.ts` supports:\n\n- exact `provider/modelId`\n- exact canonical model id\n- exact model id (provider inferred)\n- fuzzy/substring matching\n- glob scope patterns in `--models` (e.g. `openai/*`, `*sonnet*`)\n- optional `:thinkingLevel` suffix (`off|minimal|low|medium|high|xhigh`)\n\n`--provider` is legacy; `--model` is preferred.\n\nResolution precedence for exact selectors:\n\n1. exact `provider/modelId` bypasses coalescing\n2. exact canonical id resolves through the canonical index\n3. exact bare concrete id still works\n4. fuzzy and glob matching run after the exact paths\n\n### Initial model selection priority\n\n`findInitialModel(...)` uses this order:\n\n1. explicit CLI provider+model\n2. first scoped model (if not resuming)\n3. saved default provider/model\n4. known provider defaults (e.g. OpenAI/Anthropic/etc.) among available models\n5. first available model\n\n### Role aliases and settings\n\nSupported model roles:\n\n- `default`, `smol`, `slow`, `vision`, `plan`, `designer`, `commit`, `task`\n\nRole aliases like `pi/smol` expand through `settings.modelRoles`. Each role value can also append a thinking selector such as `:minimal`, `:low`, `:medium`, or `:high`.\n\nIf a role points at another role, the target model still inherits normally and any explicit suffix on the referring role wins for that role-specific use.\n\nRelated settings:\n\n- `modelRoles` (record)\n- `enabledModels` (scoped pattern list)\n- `modelProviderOrder` (global canonical-provider precedence)\n- `providers.kimiApiFormat` (`openai` or `anthropic` request format)\n- `providers.openaiWebsockets` (`auto|off|on` websocket preference for OpenAI Codex transport)\n\n`modelRoles` may store either:\n\n- `provider/modelId` to pin a concrete provider variant\n- a canonical id such as `gpt-5.3-codex` to allow provider coalescing\n\nFor `enabledModels` and CLI `--models`:\n\n- exact canonical ids expand to all concrete variants in that canonical group\n- explicit `provider/modelId` entries stay exact\n- globs and fuzzy matches still operate on concrete models\n\nGlobal `enabledModels` and `disabledProviders` entries may also be scoped to a path prefix:\n\n```yaml\nenabledModels:\n - claude-sonnet-4-5\n - path: ~/work\n models:\n - anthropic/claude-opus-4-5\ndisabledProviders:\n - ollama\n - path: ~/private\n providers:\n - anthropic\n```\n\nString entries apply everywhere. Scoped entries apply when the current working directory is the configured path or one of its subdirectories. Use `path`, `paths`, `pathPrefix`, or `pathPrefixes`; use `models` for `enabledModels`, `providers` for `disabledProviders`, or `values` for either.\n\n## `/model` and `--list-models`\n\nBoth surfaces keep provider-prefixed models visible and selectable.\n\nThey now also expose canonical/coalesced models:\n\n- `/model` includes a canonical view alongside provider tabs\n- `--list-models` prints a canonical section plus the concrete provider rows\n\nSelecting a canonical entry stores the canonical selector. Selecting a provider row stores the explicit `provider/modelId`.\n\n## Context promotion (model-level fallback chains)\n\nContext promotion is an overflow recovery mechanism for small-context variants (for example `*-spark`) that automatically promotes to a larger-context sibling when the API rejects a request with a context length error.\n\n### Trigger and order\n\nWhen a turn fails with a context overflow error (e.g. `context_length_exceeded`), `AgentSession` attempts promotion **before** falling back to compaction:\n\n1. If `contextPromotion.enabled` is true, resolve a promotion target (see below).\n2. If a target is found, switch to it and retry the request — no compaction needed.\n3. If no target is available, fall through to auto-compaction on the current model.\n\n### Target selection\n\nSelection is model-driven, not role-driven:\n\n1. `currentModel.contextPromotionTarget` (if configured)\n2. smallest larger-context model on the same provider + API\n\nCandidates are ignored unless credentials resolve (`ModelRegistry.getApiKey(...)`).\n\n### OpenAI Codex websocket handoff\n\nIf switching from/to `openai-codex-responses`, session provider state key `openai-codex-responses` is closed before model switch. This drops websocket transport state so the next turn starts clean on the promoted model.\n\n### Persistence behavior\n\nPromotion uses temporary switching (`setModelTemporary`):\n\n- recorded as a temporary `model_change` in session history\n- does not rewrite saved role mapping\n\n### Configuring explicit fallback chains\n\nConfigure fallback directly in model metadata via `contextPromotionTarget`.\n\n`contextPromotionTarget` accepts either:\n\n- `provider/model-id` (explicit)\n- `model-id` (resolved within current provider)\n\nExample (`models.yml`) for Spark -> non-Spark on the same provider:\n\n```yaml\nproviders:\n openai-codex:\n modelOverrides:\n gpt-5.3-codex-spark:\n contextPromotionTarget: openai-codex/gpt-5.3-codex\n```\n\nThe built-in model generator also assigns this automatically for `*-spark` models when a same-provider base model exists.\n\n## Compatibility and routing fields\n\nThe `compat` block on a provider or model overrides the URL-based auto-detection in `packages/ai/src/providers/openai-completions-compat.ts`. It is validated by `OpenAICompatSchema` in `packages/coding-agent/src/config/model-registry.ts` and consumed by every `openai-completions` transport (`packages/ai/src/providers/openai-completions.ts`). The canonical type is `OpenAICompat` in `packages/ai/src/types.ts`.\n\n`models.yml` accepts the following keys (all optional; unset falls back to URL detection):\n\nRequest shaping:\n\n- `supportsStore` — emit `store: false` on requests. Default: auto (off for non-standard endpoints).\n- `supportsDeveloperRole` — use the `developer` system role for reasoning models instead of `system`. Default: auto.\n- `supportsUsageInStreaming` — send `stream_options: { include_usage: true }` to receive token usage on streaming responses. Default: `true`.\n- `maxTokensField` — `\"max_completion_tokens\"` or `\"max_tokens\"`. Default: auto.\n- `supportsToolChoice` — emit the `tool_choice` parameter when the caller forces a specific tool. Default: `true`. Set `false` for endpoints that 400 on `tool_choice` (e.g. DeepSeek when reasoning is on).\n- `disableReasoningOnForcedToolChoice` — drop `reasoning_effort` / OpenRouter `reasoning` whenever `tool_choice` forces a call. Default: auto (Kimi/Anthropic-fronted endpoints).\n- `extraBody` — extra top-level fields merged into every request body (gateway hints, controller selectors, etc.).\n\nReasoning / thinking:\n\n- `supportsReasoningEffort` — accept `reasoning_effort`. Default: auto (off for Grok and zAI).\n- `reasoningEffortMap` — partial map from internal effort levels (`minimal|low|medium|high|xhigh`) to provider-specific strings (e.g. DeepSeek maps `xhigh -> \"max\"`).\n- `thinkingFormat` — request shape for thinking: `\"openai\"` (`reasoning_effort`), `\"openrouter\"` (`reasoning: { effort }`), `\"zai\"` (`thinking: { type: \"enabled\" }`), `\"qwen\"` (top-level `enable_thinking`), or `\"qwen-chat-template\"` (`chat_template_kwargs.enable_thinking`). Default: `\"openai\"`.\n- `reasoningContentField` — assistant field carrying chain-of-thought: `\"reasoning_content\"`, `\"reasoning\"`, or `\"reasoning_text\"`. Default: auto.\n- `requiresReasoningContentForToolCalls` — assistant tool-call turns must round-trip the reasoning field (DeepSeek-R1, Kimi, OpenRouter when reasoning is on). Default: `false`.\n- `requiresAssistantContentForToolCalls` — assistant tool-call turns must include non-empty text content (Kimi). Default: `false`.\n\nTool / message normalization:\n\n- `requiresToolResultName` — tool-result messages need a `name` field (Mistral). Default: auto.\n- `requiresAssistantAfterToolResult` — a user message after a tool result needs an assistant turn in between. Default: auto.\n- `requiresThinkingAsText` — convert thinking blocks to text wrapped in `` delimiters (Mistral). Default: auto.\n- `requiresMistralToolIds` — normalize tool-call ids to exactly 9 alphanumeric chars. Default: auto.\n- `supportsStrictMode` — accept the per-tool `strict` field on tool schemas. Default: conservative auto-detect per provider/baseUrl.\n- `toolStrictMode` — `\"all_strict\"` forces strict on every tool, `\"none\"` forces it off; unset keeps the existing per-tool mixed behavior.\n\nGateway routing (only applied when `baseUrl` matches the gateway):\n\n- `openRouterRouting.only` / `openRouterRouting.order` — provider routing on `openrouter.ai` (see ).\n- `vercelGatewayRouting.only` / `vercelGatewayRouting.order` — provider routing on `ai-gateway.vercel.sh` (see ).\n\nProvider-level `compat` is the baseline; per-model `compat` is deep-merged on top, with `openRouterRouting`, `vercelGatewayRouting`, and `extraBody` merged as nested objects.\n\n### Anthropic compatibility (`anthropic-messages`)\n\nFor `anthropic-messages` models the runtime uses a separate `AnthropicCompat` shape (`packages/ai/src/types.ts`). The `models.yml` schema currently exposes only the strict-tools opt-out as a top-level provider field (see below); the remaining Anthropic-side knobs (`disableAdaptiveThinking`, `supportsEagerToolInputStreaming`, `supportsLongCacheRetention`) are set by built-in catalog metadata and are not user-configurable from `models.yml`.\n\n### Strict tool schemas (`disableStrictTools`)\n\nAnthropic's API supports a `strict` field on tool definitions that forces the model to always follow the provided schema exactly. This is enabled by default for all `anthropic-messages` providers because it guarantees schema conformance in agentic systems.\n\nThird-party providers that front the Anthropic API (AWS Bedrock, Azure, self-hosted proxies) do not always implement this field and will reject requests that include it. Set `disableStrictTools: true` at the provider level to opt out:\n\n```yaml\nproviders:\n bedrock-anthropic:\n baseUrl: https://bedrock-runtime.us-east-1.amazonaws.com/anthropic\n apiKey: AWS_BEARER_TOKEN\n api: anthropic-messages\n disableStrictTools: true\n models:\n - id: claude-sonnet-4-20250514\n name: Claude Sonnet 4 (Bedrock)\n input: [text, image]\n contextWindow: 200000\n maxTokens: 16384\n cost:\n input: 3.00\n output: 15.00\n cacheRead: 0.30\n cacheWrite: 3.75\n```\n\n`disableStrictTools` is a provider-level flag that applies to all models in the provider.\n\nTool schemas going on the wire are normalized by the unified flow in\n`packages/ai/src/utils/schema/normalize.ts` (Google/CCA/MCP dispatchers\nplus the OpenAI strict-mode sanitize+enforce pipeline). See\n[`ai-schema-normalize.md`](./ai-schema-normalize.md) for the strict-mode\nedge cases (local `$ref` inlining, single-item `allOf` collapse,\n`anyOf`-wrapper description hoist, enum/const primitive-type inference)\nand the per-provider dispatcher mapping.\n## Practical examples\n\n### Local OpenAI-compatible endpoint (no auth)\n\n```yaml\nproviders:\n local-openai:\n baseUrl: http://127.0.0.1:8000/v1\n auth: none\n api: openai-completions\n models:\n - id: Qwen/Qwen2.5-Coder-32B-Instruct\n name: Qwen 2.5 Coder 32B (local)\n```\n\n### Hosted proxy with env-based key\n\n```yaml\nproviders:\n anthropic-proxy:\n baseUrl: https://proxy.example.com/anthropic\n apiKey: ANTHROPIC_PROXY_API_KEY\n api: anthropic-messages\n authHeader: true\n disableStrictTools: true # if the proxy doesn't support strict tool schemas\n models:\n - id: claude-sonnet-4-20250514\n name: Claude Sonnet 4 (Proxy)\n reasoning: true\n input: [text, image]\n```\n\n### Override built-in provider route + model metadata\n\n```yaml\nproviders:\n openrouter:\n baseUrl: https://my-proxy.example.com/v1\n headers:\n X-Team: platform\n modelOverrides:\n anthropic/claude-sonnet-4:\n name: Sonnet 4 (Corp)\n compat:\n openRouterRouting:\n only: [anthropic]\n```\n\n## Legacy consumer caveat\n\nMost model configuration now flows through `models.yml` via `ModelRegistry`. Explicit `.json` / `.jsonc` paths remain supported only when passed programmatically to `ModelRegistry`; the default user config is `~/.omp/agent/models.yml`.\n\n## Failure mode\n\nIf `models.yml` fails schema or validation checks:\n\n- registry keeps operating with built-in models\n- error is exposed via `ModelRegistry.getError()` and surfaced in UI/notifications\n", "natives-addon-loader-runtime.md": "# Natives Addon Loader Runtime\n\nThis document covers the runtime loader shipped by `@oh-my-pi/pi-natives`: how `native/index.js` decides which `.node` file to require, how compiled-binary embedded payloads are extracted, and what startup failures report.\n\n## Implementation files\n\n- `packages/natives/native/index.js`\n- `packages/natives/native/loader-state.js`\n- `packages/natives/native/embedded-addon.js`\n- `packages/natives/scripts/embed-native.ts`\n- `packages/natives/package.json`\n\n## Scope and responsibility\n\nThe loader is intentionally narrow:\n\n- Build a platform/CPU-aware candidate list for addon filenames and directories.\n- Treat an embedded-addon manifest as the authoritative compiled-binary signal when present.\n- Optionally materialize an embedded addon into a versioned per-user cache directory.\n- Attempt candidates in deterministic order and return the first addon that `require(...)` loads.\n\nThe current loader does **not** run a separate `validateNative(...)` export-presence gate. API shape is provided by the generated N-API binding file (`native/index.d.ts`) and the loaded addon itself. A stale binary therefore normally fails as a missing property or native load error rather than as a custom \"missing exports\" validation error.\n\n## Runtime inputs and derived state\n\nAt module initialization, `native/index.js` computes:\n\n- **Platform tag**: `${process.platform}-${process.arch}` (for example `darwin-arm64`).\n- **Package version**: from `packages/natives/package.json`.\n- **Core directories**:\n - `nativeDir`: package-local `packages/natives/native`.\n - `execDir`: directory containing `process.execPath`.\n - `versionedDir`: `/`.\n - `userDataDir` fallback:\n - Windows: `%LOCALAPPDATA%/omp` or `%USERPROFILE%/AppData/Local/omp`.\n - Non-Windows: `~/.local/bin`.\n- **Natives cache root** (`getNativesDir()`):\n - if `$XDG_DATA_HOME/omp` exists, `$XDG_DATA_HOME/omp/natives`;\n - otherwise `~/.omp/natives`.\n- **Compiled-binary mode** (`detectCompiledBinary`): true if any of:\n - embedded-addon manifest is non-null,\n - `PI_COMPILED` env var is set,\n - `import.meta.url` contains Bun embedded markers (`$bunfs`, `~BUN`, `%7EBUN`).\n- **Variant override**: `PI_NATIVE_VARIANT` (`modern`/`baseline` only; invalid values ignored).\n- **Selected variant**: explicit override, otherwise runtime AVX2 detection on x64 (`modern` if AVX2, else `baseline`).\n\n## Platform support and tag resolution\n\n`SUPPORTED_PLATFORMS` is fixed to:\n\n- `linux-x64`\n- `linux-arm64`\n- `darwin-x64`\n- `darwin-arm64`\n- `win32-x64`\n\nUnsupported platforms are not rejected before probing. The loader first tries the computed candidate paths. If all fail and `platformTag` is unsupported, it throws an unsupported-platform error listing supported tags.\n\n## Variant selection (`modern` / `baseline` / default)\n\n### x64 behavior\n\n1. `PI_NATIVE_VARIANT=modern|baseline` wins when valid.\n2. Otherwise AVX2 support is detected:\n - Linux: scan `/proc/cpuinfo` for `avx2`.\n - macOS: `sysctl -n machdep.cpu.leaf7_features`, then `machdep.cpu.features`.\n - Windows: PowerShell `[System.Runtime.Intrinsics.X86.Avx2]::IsSupported`.\n3. AVX2 selects `modern`; unavailable or undetectable AVX2 selects `baseline`.\n\n### Non-x64 behavior\n\nNo variant suffix is used; the filename is `pi_natives.-.node`.\n\n### Filename construction\n\n`loader-state.js#getAddonFilenames` returns:\n\n- Non-x64 or no variant: `pi_natives..node`\n- x64 + `modern`:\n 1. `pi_natives.-modern.node`\n 2. `pi_natives.-baseline.node`\n 3. `pi_natives..node`\n- x64 + `baseline`:\n 1. `pi_natives.-baseline.node`\n 2. `pi_natives..node`\n\nThe default unsuffixed fallback remains part of the x64 candidate list.\n\n## Candidate path construction and fallback ordering\n\n`resolveLoaderCandidates(...)` expands every filename across directories, then de-duplicates while preserving first occurrence order.\n\n### Non-compiled runtime\n\nFor each filename, candidates are:\n\n1. `/`\n2. `/`\n\n### Compiled runtime\n\nFor each filename, candidates are:\n\n1. `/`\n2. `/`\n3. `/`\n4. `/`\n\nAt load time, an extracted embedded candidate, when produced, is prepended ahead of these de-duplicated candidates.\n\n## Embedded addon extraction lifecycle\n\n`embedded-addon.js` is generated by `scripts/embed-native.ts`. The reset stub exports `embeddedAddon = null`. A populated manifest has:\n\n- `platformTag`\n- `version`\n- `files[]` entries with `variant`, `filename`, and `filePath`\n\nExtraction (`maybeExtractEmbeddedAddon`) runs only when:\n\n1. compiled-binary mode is true,\n2. `embeddedAddon` is non-null,\n3. manifest `platformTag` equals the runtime platform tag,\n4. manifest `version` equals the package version,\n5. a variant-appropriate embedded file exists.\n\nVariant file selection:\n\n- Non-x64: prefer `default`, then first available file.\n- x64 + `modern`: prefer `modern`, fallback to `baseline`.\n- x64 + `baseline`: require `baseline`.\n\nMaterialization:\n\n1. Ensure `` exists.\n2. Reuse `/` if it already exists.\n3. Otherwise read `selectedEmbeddedFile.filePath` and write the target path.\n4. Return the target path as the first candidate.\n\nDirectory creation or write failures are appended to the loader error list; probing continues through normal candidates.\n\n## Lifecycle and state transitions\n\n```text\nInit\n -> Load package metadata and embedded-addon manifest\n -> Compute platform/version/variant/filenames/candidate paths\n -> (compiled + embedded manifest matches?)\n yes -> try extract to versionedDir (record errors, continue)\n no -> skip extraction\n -> For each runtime candidate in order:\n require(candidate)\n -> success: return addon exports (READY)\n -> failure: record error, continue\n -> none loaded:\n if unsupported platform tag -> throw Unsupported platform\n else -> throw Failed to load (tried-path diagnostics + hints)\n```\n\n## Failure behavior and diagnostics\n\n### Unsupported platform\n\nIf all candidates fail and `platformTag` is not supported, the loader throws:\n\n- `Unsupported platform: `\n- supported platform list\n- issue-reporting guidance\n\n### No loadable candidate\n\nIf the platform is supported but no candidate can be loaded, the final error includes:\n\n- `Failed to load pi_natives native addon for ` or ` ()`\n- every attempted path with the corresponding `require(...)` error\n- mode-specific remediation hints\n\n### Compiled-binary startup failures\n\nCompiled mode diagnostics include:\n\n- expected versioned cache target paths (`/`),\n- remediation to delete the versioned cache and rerun,\n- direct release download `curl` commands for each expected filename.\n\n### Non-compiled startup failures\n\nNormal package/runtime diagnostics include:\n\n- reinstall hint (`bun install @oh-my-pi/pi-natives`),\n- local rebuild command (`bun --cwd=packages/natives run build`),\n- optional x64 variant build hint (`TARGET_VARIANT=baseline|modern bun --cwd=packages/natives run build`).\n", "natives-architecture.md": "# Natives Architecture\n\n`@oh-my-pi/pi-natives` is now a two-layer package around a loader:\n\n1. **CommonJS loader/package entrypoint** resolves and loads the correct `.node` addon and patches generated enum objects onto the export object.\n2. **Rust N-API module layer** implements the exported functions/classes and emits the generated TypeScript declarations.\n\nThis document is the foundation for deeper module-level docs.\n\n## Implementation files\n\n- `packages/natives/native/index.js`\n- `packages/natives/native/index.d.ts`\n- `packages/natives/native/loader-state.js`\n- `packages/natives/native/embedded-addon.js`\n- `packages/natives/scripts/build-native.ts`\n- `packages/natives/scripts/embed-native.ts`\n- `packages/natives/scripts/gen-enums.ts`\n- `packages/natives/package.json`\n- `crates/pi-natives/src/lib.rs`\n\n## Package entrypoint and public surface\n\n`packages/natives/package.json` points directly at generated native bindings:\n\n- `main`: `./native/index.js`\n- `types`: `./native/index.d.ts`\n- `exports[\".\"].types`: `./native/index.d.ts`\n- `exports[\".\"].import`: `./native/index.js`\n\nThere is no current `packages/natives/src` TypeScript wrapper layer. Consumers import functions/classes/enums directly from `@oh-my-pi/pi-natives`; the type contract is the generated `native/index.d.ts` plus enum exports appended by `scripts/gen-enums.ts`.\n\nCurrent capability groups in the generated API include:\n\n- **Search/text/code primitives**: `grep`, `search`, `hasMatch`, `fuzzyFind`, `glob`, `astGrep`, `astEdit`, text width/slicing/wrapping/sanitization, syntax highlighting, token counting.\n- **Execution/process/terminal primitives**: `executeShell`, `Shell`, `PtySession`, process-tree helpers, key parsing.\n- **System/media/conversion primitives**: clipboard, image resize/encode/SIXEL, HTML-to-Markdown, macOS appearance/power helpers, work profiling, Windows ProjFS overlay helpers.\n\n## Loader layer\n\n`packages/natives/native/index.js` owns runtime addon selection and optional embedded extraction.\n\n### Candidate resolution model\n\n- Platform tag is `${process.platform}-${process.arch}`.\n- Supported tags are currently:\n - `linux-x64`\n - `linux-arm64`\n - `darwin-x64`\n - `darwin-arm64`\n - `win32-x64`\n- x64 can use CPU variants:\n - `modern` (AVX2-capable)\n - `baseline` (fallback)\n- Non-x64 uses the default filename without a variant suffix.\n\nFilename strategy:\n\n- Default: `pi_natives.-.node`\n- x64 variant: `pi_natives.--modern.node` or `...-baseline.node`\n- x64 runtime fallback includes the unsuffixed default filename after variant candidates.\n\n### Platform-specific variant detection\n\nFor x64, variant selection uses:\n\n- Linux: `/proc/cpuinfo`\n- macOS: `sysctl -n machdep.cpu.leaf7_features`, then `machdep.cpu.features`\n- Windows: PowerShell check for `System.Runtime.Intrinsics.X86.Avx2`\n\n`PI_NATIVE_VARIANT` can force `modern` or `baseline`; invalid values are ignored.\n\n### Binary distribution and extraction model\n\n`packages/natives/package.json` publishes `native/`, which contains the loader, generated declarations, generated enum patch, embedded-addon manifest stub, and prebuilt `.node` artifacts.\n\nFor compiled binaries, loader behavior is:\n\n1. Check versioned user cache path: `//...`.\n2. Check legacy compiled-binary location:\n - Windows: `%LOCALAPPDATA%/omp` (fallback `%USERPROFILE%/AppData/Local/omp`)\n - non-Windows: `~/.local/bin`\n3. Fall back to packaged `native/` and executable directory candidates.\n\n`getNativesDir()` uses `$XDG_DATA_HOME/omp/natives` when `$XDG_DATA_HOME/omp` exists; otherwise it uses `~/.omp/natives`.\n\nIf a populated embedded addon manifest is present, it is also treated as a compiled-binary signal. The loader can extract the matching embedded `.node` into the versioned cache directory before candidate probing.\n\n### Failure modes\n\nLoader failures are explicit:\n\n- **Unsupported platform tag**: after failed probing, throws with supported platform list.\n- **No loadable candidate**: throws with all attempted paths and remediation hints.\n- **Embedded extraction errors**: directory/write failures are recorded and included in final load diagnostics if no candidate loads.\n\nThe current loader does not perform a separate post-`require` export validation pass.\n\n## Rust N-API module layer\n\n`crates/pi-natives/src/lib.rs` declares exported module ownership:\n\n- `appearance`\n- `ast`\n- `clipboard`\n- `fd`\n- `fs_cache`\n- `glob`\n- `glob_util`\n- `grep`\n- `highlight`\n- `html`\n- `image`\n- `keys`\n- `language`\n- `power`\n- `prof`\n- `projfs_overlay`\n- `ps`\n- `pty`\n- `shell`\n- `task`\n- `text`\n- `tokens`\n- `utils` (crate-private helpers)\n\nN-API exports are generated from Rust `#[napi]` functions/classes/objects/enums. Snake_case Rust names are exposed as camelCase JavaScript names unless explicitly configured by napi-rs.\n\n## Ownership boundaries\n\n- **Loader/package ownership (`packages/natives/native`, `packages/natives/scripts`)**\n - runtime binary selection\n - CPU variant selection and override handling\n - compiled-binary embedded extraction\n - generated TypeScript declarations and enum export patching\n- **Rust ownership (`crates/pi-natives/src`)**\n - algorithmic and system-level implementation\n - platform-native behavior and performance-sensitive logic\n - N-API symbol implementation consumed directly by package callers\n- **Consumer ownership (`packages/coding-agent`, `packages/tui`)**\n - user-facing policy and fallbacks that are not built into the native API\n - higher-level rendering, artifact, shell-session, and command behavior\n\n## Runtime flow (high level)\n\n1. Consumer imports from `@oh-my-pi/pi-natives`.\n2. `native/index.js` computes platform/arch/variant and candidate paths.\n3. Optional embedded binary extraction occurs for compiled distributions.\n4. The first `require(candidate)` that succeeds becomes the exported addon object.\n5. Generated enum objects are appended to `module.exports`.\n6. Caller invokes generated N-API functions/classes directly.\n\n## Glossary\n\n- **Native addon**: A `.node` binary loaded via Node-API (N-API).\n- **Platform tag**: Runtime tuple `platform-arch` (for example `darwin-arm64`).\n- **Variant**: x64 CPU-specific build flavor (`modern` AVX2, `baseline` fallback).\n- **Generated binding declaration**: `native/index.d.ts` emitted by napi-rs during `build-native.ts`.\n- **Compiled binary mode**: Runtime mode where the CLI is bundled and native addons are resolved from embedded/cache paths before package-local paths.\n- **Embedded addon**: Build artifact metadata and file references generated into `native/embedded-addon.js` so compiled binaries can extract matching `.node` payloads.\n", "natives-binding-contract.md": "# Natives Binding Contract (JavaScript/TypeScript Side)\n\nThis document defines the JS/TS contract between `@oh-my-pi/pi-natives` callers and the loaded N-API addon.\n\nCurrent package shape is direct-to-native: there is no `packages/natives/src/` TypeScript wrapper layer. The public API is the generated `packages/natives/native/index.d.ts` declaration file, the CommonJS loader in `packages/natives/native/index.js`, and the Rust `#[napi]` exports in `crates/pi-natives/src`.\n\n## Implementation files\n\n- `packages/natives/native/index.js`\n- `packages/natives/native/index.d.ts`\n- `packages/natives/native/loader-state.js`\n- `packages/natives/scripts/build-native.ts`\n- `packages/natives/scripts/gen-enums.ts`\n- `packages/natives/package.json`\n- `crates/pi-natives/src/lib.rs`\n- Rust modules under `crates/pi-natives/src/*.rs`\n\n## Contract model\n\nThe contract has three parts:\n\n1. **Generated runtime loader** (`native/index.js`)\n - computes candidates and `require(...)`s the `.node` addon;\n - exports the loaded addon object directly;\n - appends enum objects generated by `scripts/gen-enums.ts`.\n2. **Generated TypeScript declarations** (`native/index.d.ts`)\n - generated by napi-rs during `scripts/build-native.ts`;\n - declares exported functions, classes, object interfaces, and native enums;\n - is the package `types` entry.\n3. **Rust N-API exports** (`crates/pi-natives/src`)\n - `#[napi]` functions/classes/objects/enums are the source of generated declarations and runtime symbols;\n - snake_case Rust names become camelCase JavaScript names by napi-rs convention.\n\nThere is no current `NativeBindings` declaration-merging lifecycle and no `validateNative(...)` required-export list in the loader.\n\n## Public export surface organization\n\n`packages/natives/package.json` exposes the package root only:\n\n```json\n{\n \"main\": \"./native/index.js\",\n \"types\": \"./native/index.d.ts\",\n \"exports\": {\n \".\": {\n \"types\": \"./native/index.d.ts\",\n \"import\": \"./native/index.js\"\n }\n }\n}\n```\n\nConsumers in `packages/coding-agent` and `packages/tui` import directly from `@oh-my-pi/pi-natives`.\n\n## JS API ↔ native export mapping (representative)\n\n| Category | Public JS API | Rust source | Return style |\n| ----------------- | ---------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------- | ----------------------------- |\n| Grep | `grep(options, onMatch?)` | `grep.rs` | `Promise` |\n| Grep | `search(content, options)` | `grep.rs` | `SearchResult` |\n| Grep | `hasMatch(content, pattern, ignoreCase?, multiline?)` | `grep.rs` | `boolean` |\n| Fuzzy path search | `fuzzyFind(options)` | `fd.rs` | `Promise` |\n| Glob | `glob(options, onMatch?)` | `glob.rs` | `Promise` |\n| Glob cache | `invalidateFsScanCache(path?)` | `fs_cache.rs` | `void` |\n| AST search/edit | `astGrep(options)`, `astEdit(options)` | `ast.rs` | `Promise<...>` |\n| Shell | `executeShell(options, onChunk?)` | `shell.rs` | `Promise` |\n| Shell | `new Shell(options?)`, `shell.run(...)`, `shell.abort()` | `shell.rs` | class / promises |\n| PTY | `new PtySession()`, `start/write/resize/kill` | `pty.rs` | class / promises |\n| Process | `killTree(pid, signal)`, `listDescendants(pid)` | `ps.rs` | sync |\n| Keys | `parseKey`, `matchesKey`, Kitty/legacy helpers | `keys.rs` | sync |\n| Text | `wrapTextWithAnsi`, `truncateToWidth`, `sliceWithWidth`, `extractSegments`, `visibleWidth` | `text.rs` | sync |\n| Highlight | `highlightCode`, `supportsLanguage`, `getSupportedLanguages` | `highlight.rs` | sync |\n| HTML | `htmlToMarkdown(html, options?)` | `html.rs` | `Promise` |\n| Image | `PhotonImage`, `encodeSixel` | `image.rs` | class / sync / promises |\n| Clipboard | `copyToClipboard`, `readImageFromClipboard` | `clipboard.rs` | sync / promise |\n| Tokens | `countTokens(input, encoding?)` | `tokens.rs` | sync |\n| System | `detectMacOSAppearance`, `MacAppearanceObserver`, `MacOSPowerAssertion`, `getWorkProfile`, ProjFS helpers | `appearance.rs`, `power.rs`, `prof.rs`, `projfs_overlay.rs` | mixed |\n\n## Sync vs async contract differences\n\nThe contract preserves Rust/N-API call style:\n\n- **Promise-returning exports** for worker-thread or async runtime work (`grep`, `glob`, `fuzzyFind`, `astGrep`, `astEdit`, `htmlToMarkdown`, shell/PTY runs, image parse/resize/encode, clipboard image read).\n- **Synchronous exports** for deterministic in-memory transforms/parsers or direct system calls (`search`, `hasMatch`, highlighting, text utilities, token counting, process queries, `copyToClipboard`, `encodeSixel`).\n- **Constructor exports** for stateful runtime objects (`Shell`, `PtySession`, `PhotonImage`, macOS observer/power handles).\n\nChanging sync ↔ async for an existing export is a breaking public API change because consumers call these exports directly.\n\n## Object and enum typing patterns\n\n### Object patterns\n\n`#[napi(object)]` Rust structs become TS interfaces, for example:\n\n- `GrepResult`, `SearchResult`, `GlobResult`, `FuzzyFindResult`\n- `ShellRunResult`, `ShellExecuteResult`, `PtyRunResult`, `MinimizerResult`\n- `AstFindResult`, `AstReplaceResult`\n- `System`/media payloads such as `ClipboardImage`, `WorkProfile`, `ParsedKittyResult`\n\nRuntime shape correctness is owned by napi-rs and the Rust implementation.\n\n### Enum patterns\n\nNative enums are represented in generated declarations and also appended to `module.exports` by `scripts/gen-enums.ts`, because the loader is hand-maintained CommonJS around the generated addon. Current enum objects include:\n\n- `AstMatchStrictness`\n- `Ellipsis`\n- `Encoding`\n- `FileType`\n- `GrepOutputMode`\n- `ImageFormat`\n- `KeyEventType`\n- `MacOSAppearance`\n- `SamplingFilter`\n\n## Error behavior and caveats\n\n- Addon load failure or unsupported platform throws during package import from `native/index.js`.\n- The loader does not verify the full export set after `require(...)`; stale or mismatched binaries surface as native load errors or missing members at use sites.\n- N-API conversion validates basic argument conversion, but TS optional fields do not guarantee semantic validity for untyped callers.\n- Numeric enum declarations do not prevent out-of-range numeric values from untyped callers unless the Rust function rejects them during conversion.\n- Callback exports use napi-rs `ThreadsafeFunction` shape: `(error: Error | null, value) => void`. Native code generally emits successful values; hard failures reject/throw through the owning call.\n\n## Maintainer checklist for binding changes\n\nWhen adding/changing an export, update all of:\n\n1. Rust `#[napi]` implementation in the owning `crates/pi-natives/src/.rs`.\n2. `crates/pi-natives/src/lib.rs` if a new module is added.\n3. Any consumer imports/callsites in `packages/coding-agent` or `packages/tui`.\n4. Build output by running the natives build so `native/index.d.ts` and `native/index.js` stay in sync.\n5. `scripts/gen-enums.ts` if enum runtime export patching needs to change.\n\nDo not add a parallel TS wrapper convention unless the package design intentionally moves back to wrappers; current consumers depend on the direct generated API.\n", "natives-build-release-debugging.md": "# Natives Build, Release, and Debugging Runbook\n\nThis runbook describes how `@oh-my-pi/pi-natives` produces `.node` addons, generated declarations, and compiled-binary embedded payloads, and how to debug loader/build failures.\n\nIt follows the architecture terms from `docs/natives-architecture.md`:\n\n- **build-time artifact production** (`scripts/build-native.ts`)\n- **embedded addon manifest generation** (`scripts/embed-native.ts`)\n- **runtime addon loading** (`native/index.js`, `native/loader-state.js`)\n\n## Implementation files\n\n- `packages/natives/scripts/build-native.ts`\n- `packages/natives/scripts/embed-native.ts`\n- `packages/natives/scripts/gen-enums.ts`\n- `packages/natives/package.json`\n- `packages/natives/native/index.js`\n- `packages/natives/native/loader-state.js`\n- `crates/pi-natives/Cargo.toml`\n\n## Build pipeline overview\n\n### 1) Build entrypoints\n\n`packages/natives/package.json` scripts:\n\n- `bun scripts/build-native.ts` (`build`) → N-API build, addon install, generated declarations install, enum export patch.\n- `bun scripts/embed-native.ts` (`embed:native`) → generate `native/embedded-addon.js` from built files.\n\nRoot scripts include `build:native` as `bun --cwd=packages/natives run build`.\n\n### 2) N-API/Rust artifact build\n\n`build-native.ts` invokes the `@napi-rs/cli` binary directly from `node_modules/.bin` with:\n\n- `napi build`\n- `--manifest-path crates/pi-natives/Cargo.toml`\n- `--package-json-path packages/natives/package.json`\n- `--platform`\n- `--no-js`\n- `--dts index.d.ts`\n- `--profile local` for non-CI local native builds, otherwise `--profile ci`\n- optional `--target `\n\n`crates/pi-natives/Cargo.toml` declares `crate-type = [\"cdylib\"]`; napi-rs emits `.node` artifacts plus generated `index.d.ts` in an isolated temporary output directory under `packages/natives/native/.build/`.\n\n### 3) Artifact install\n\nAfter napi-rs succeeds, `build-native.ts`:\n\n1. resolves the built addon in the isolated output directory;\n2. normalizes its name to `pi_natives.-(-variant).node` when needed;\n3. installs the addon into `packages/natives/native/` with temp-file + rename semantics;\n4. copies generated `index.js` and `index.d.ts` into `packages/natives/native/` when present;\n5. runs `generateEnumExports()` to append enum runtime objects to `native/index.js`.\n\nWindows locked-DLL replacement failures are reported with an explicit close-running-processes hint.\n\n## Target/variant model and naming conventions\n\n## Platform tag\n\nBoth build and runtime use platform tag:\n\n`-` (example: `darwin-arm64`, `linux-x64`).\n\n## Variant model (x64 only)\n\nx64 supports CPU variants:\n\n- `modern` (AVX2-capable path)\n- `baseline` (fallback)\n\nNon-x64 uses a single default artifact with no variant suffix.\n\n### Output filenames\n\n- x64: `pi_natives.--modern.node` or `...-baseline.node`\n- non-x64: `pi_natives.-.node`\n\nRuntime x64 candidate order also includes the unsuffixed default filename after the selected variant candidates.\n\n## Environment flags and build options\n\n## Runtime flags\n\n- `PI_NATIVE_VARIANT`: x64 runtime override; valid values are `modern` and `baseline`.\n- `PI_COMPILED`: legacy compiled-mode signal. A populated embedded-addon manifest is also a compiled-mode signal and is the authoritative signal for Bun standalone builds that do not preserve `process.env.PI_COMPILED`.\n\n## Build-time flags/options\n\n- `CROSS_TARGET`: passed to napi-rs as `--target `.\n- `TARGET_PLATFORM`: override output platform tag naming.\n- `TARGET_ARCH`: override output arch naming.\n- `TARGET_VARIANT` (x64 only): force `modern` or `baseline` for output filename and RUSTFLAGS policy.\n- `CARGO_TARGET_DIR`: respected if set; otherwise the default `target/` dir is used so `Swatinem/rust-cache` can cache cleanly.\n- `RUSTFLAGS`:\n - if unset and not cross-compiling, script sets:\n - modern: `-C target-cpu=x86-64-v3`\n - baseline: `-C target-cpu=x86-64-v2`\n - non-x64 / no variant: `-C target-cpu=native`\n - if already set, script does not override.\n\n## Build state/lifecycle transitions\n\n### Build lifecycle (`build-native.ts`)\n\n1. **Init**: parse env, resolve target tuple, cross/local mode, profile label.\n2. **Variant resolve**:\n - non-x64 → no variant;\n - x64 + `TARGET_VARIANT` → explicit variant;\n - x64 cross-build without `TARGET_VARIANT` → hard error;\n - x64 local build without override → detect host AVX2.\n3. **CPU policy**: set `RUSTFLAGS` for the resolved variant unless the caller already provided one.\n4. **Compile**: run napi-rs against `crates/pi-natives` into an isolated output directory.\n5. **Locate artifact**: accept the canonical filename or a single napi-rs-generated `pi_natives.-*.node` candidate.\n6. **Install**: copy/rename addon into `packages/natives/native`.\n7. **Install generated bindings**: copy `index.js`/`index.d.ts` if needed.\n8. **Patch enums**: append generated enum runtime exports.\n9. **Cleanup**: remove the temporary build output directory.\n\nFailure exits have explicit error text for invalid variants, failed napi build, missing/multiple output artifacts, generated binding install failure, and install/rename failure.\n\n### Embed lifecycle (`embed-native.ts`)\n\n1. **Init**: compute platform tag from `TARGET_PLATFORM`/`TARGET_ARCH` or host values.\n2. **Candidate set**:\n - x64 looks for `modern` and `baseline` files;\n - non-x64 looks for one default file.\n3. **Validate availability**: at least one expected file must exist in `packages/natives/native`.\n4. **Generate manifest** (`native/embedded-addon.js`) with Bun `file` imports and package version.\n5. **Runtime extraction ready** for compiled mode.\n\n`--reset` writes the null manifest stub (`embeddedAddon = null`) without validating addon availability.\n\n## Dev workflow vs shipped/compiled behavior\n\n## Local development workflow\n\nTypical local loop:\n\n1. Build addon: `bun --cwd=packages/natives run build`.\n2. Loader resolves package-local `native/` candidates, then executable-dir fallback candidates.\n3. Generated declarations in `native/index.d.ts` describe the public TS API.\n\n## Shipped/compiled binary workflow\n\nIn compiled mode (`PI_COMPILED`, Bun embedded URL markers, or populated embedded manifest):\n\n1. Loader computes versioned cache dir: `/`.\n2. If embedded manifest matches current platform+version, loader may extract the selected embedded file into that versioned dir.\n3. Runtime candidate order includes:\n - versioned cache dir,\n - legacy compiled-binary dir (`%LOCALAPPDATA%/omp` on Windows, `~/.local/bin` elsewhere),\n - package/executable directories.\n4. First successfully loaded addon is returned.\n\nThis is why packaging + runtime loader expectations must align: filenames, platform tags, CPU variants, and embedded manifest version must match what `native/index.js` probes.\n\n## JS API ↔ Rust export mapping (build sanity subset)\n\nGenerated declarations currently include exports from these Rust modules:\n\n| Area | Representative JS exports | Rust source |\n| ---------------------- | ------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |\n| Search | `grep`, `search`, `hasMatch`, `fuzzyFind`, `glob`, `invalidateFsScanCache` | `grep.rs`, `fd.rs`, `glob.rs`, `fs_cache.rs` |\n| AST | `astGrep`, `astEdit` | `ast.rs` |\n| Text/highlight/tokens | `visibleWidth`, `truncateToWidth`, `highlightCode`, `countTokens` | `text.rs`, `highlight.rs`, `tokens.rs` |\n| Shell/PTY/process/keys | `executeShell`, `Shell`, `PtySession`, `killTree`, `parseKey` | `shell.rs`, `pty.rs`, `ps.rs`, `keys.rs` |\n| Media/system | `PhotonImage`, `encodeSixel`, clipboard, macOS appearance/power, `getWorkProfile`, ProjFS helpers | `image.rs`, `clipboard.rs`, `appearance.rs`, `power.rs`, `prof.rs`, `projfs_overlay.rs` |\n\n## Failure behavior and diagnostics\n\n## Build-time failures\n\n- Invalid variant configuration:\n - `TARGET_VARIANT` set on non-x64 → immediate error.\n - unsupported `TARGET_VARIANT` value → immediate error.\n - x64 cross-build without explicit `TARGET_VARIANT` → immediate error.\n- napi-rs build failure: script surfaces non-zero exit and stderr.\n- Artifact not found or ambiguous: script prints expected/candidate filenames and output directory contents.\n- Install failure: explicit message; Windows includes locked-file hint.\n- Generated binding install failure: explicit source/destination message.\n\n## Runtime loader failures (`native/index.js`)\n\n- Unsupported platform tag: throws with supported platform list after probing fails.\n- No candidate could load: throws with full candidate error list and mode-specific remediation hints.\n- Embedded extraction problems: extraction mkdir/write errors are recorded and included in final diagnostics if load fails.\n\n## Troubleshooting matrix\n\n| Symptom | Likely cause | Verify | Fix |\n| ---------------------------------------------------------------------- | ------------------------------------------------------------------------------------------- | ----------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |\n| `Cannot find module` or dynamic library load error for every candidate | Missing release artifact, wrong platform tag, or stale compiled cache | Inspect loader error list and `packages/natives/native` filenames | Build correct target/variant; delete stale cache for the package version |\n| Export is missing at runtime but present in TypeScript | Stale `.node` loaded, generated declarations newer than binary, or Rust export not compiled | Require the actual candidate and inspect `Object.keys(mod)` | Rebuild native package and remove stale candidate/cache paths |\n| x64 machine loads baseline when modern expected | `PI_NATIVE_VARIANT=baseline`, no AVX2 detected, or modern file unavailable | Check env and filenames in `native/` | Build modern variant (`TARGET_VARIANT=modern ... build`) and ship it |\n| Cross-build produces wrong-labeled binary | Mismatch between `CROSS_TARGET` and `TARGET_PLATFORM`/`TARGET_ARCH`, or missing x64 variant | Confirm env tuple and output filename | Re-run with consistent env values and explicit x64 `TARGET_VARIANT` |\n| Compiled binary fails after upgrade | Stale extracted cache or embedded manifest version mismatch | Inspect `/` and loader error list | Delete versioned cache for the package version; regenerate embedded manifest during packaging |\n| `embed:native` fails with `No native addons found` | Required platform artifact was not built before embedding | Check expected list in error text | Build at least one expected artifact for the target, then rerun `embed:native` |\n\n## Operational commands\n\n```bash\n# Release artifact for current host\nbun --cwd=packages/natives run build\n\n# Build explicit x64 variants\nTARGET_VARIANT=modern bun --cwd=packages/natives run build\nTARGET_VARIANT=baseline bun --cwd=packages/natives run build\n\n# Generate embedded addon manifest from built native files\nbun --cwd=packages/natives run embed:native\n\n# Reset embedded manifest to null stub\nbun --cwd=packages/natives run embed:native -- --reset\n```\n\n## Orchestrator-side content-addressed build cache (robomp)\n\nWhen `pi-natives` is built inside the robomp orchestrator (`python/robomp/`), workspaces share built artifacts through a content-addressed cache instead of rebuilding from scratch in every per-issue worktree. The cache is **orchestrator-side only** — `bun --cwd=packages/natives run build` itself is unchanged; the cache lives outside the build pipeline and is populated/captured around `ensure_workspace` and post-task success in `python/robomp/src/natives_cache.py`.\n\n### What is cached\n\nThe complete set of files in `packages/natives/native/` that are pure functions of the cache-key inputs:\n\n- `pi_natives.-[-variant].node` (glob `pi_natives.*.node`)\n- `index.d.ts`\n- `index.js`\n- `embedded-addon.js`\n- `manifest.json` (cache metadata: key, target triple, capture timestamp, source workspace, commit)\n\nAn entry is only considered a hit when the `.node` glob matches AND every companion plus the manifest is present. Partial entries are evicted on GC.\n\n### Cache key\n\nThe key is `sha256` over `(path \\t git-tree-hash \\n)` pairs for the following inputs, in this order (order is significant), followed by the target triple:\n\n1. `crates` (whole subtree — pi-natives transitively depends on other workspace crates)\n2. `Cargo.lock`\n3. `Cargo.toml`\n4. `rust-toolchain.toml`\n5. `packages/natives` (whole subtree — build script, `scripts/*`, package.json with napi config)\n\nTree hashes come from one `git cat-file --batch-check` invocation against `HEAD`; paths missing from `HEAD` fold in as a fixed null hash so the key stays deterministic across repos that don't ship every input. The target-triple suffix matches the napi addon basename convention (`-` for non-x64, `--` for x64). When `TARGET_VARIANT` is unset on an x64 host the variant component is `host` rather than autodetected — the key is stable on a given machine but a `modern`/`baseline` build with an explicit `TARGET_VARIANT` gets a different key.\n\nAnything outside this input set (Rust toolchain auto-installed delta, host glibc, env vars other than `TARGET_VARIANT`) is **not** in the key. If you need to invalidate after such a change, delete the cache directory by hand or bump one of the input files.\n\n### Layout and ownership\n\n- Root: `/data/cache/pi-natives` (provisioned by `entrypoint.sh` alongside the cargo caches, owned `root:omp`, mode `02770` setgid so cached files inherit `gid=omp` and stay readable by every slot user).\n- Per-repo subdirectory: `//` where the slug is `owner__repo` (mirrors `SandboxManager.pool_path`).\n- Per-entry directory: `///` containing the cached files plus `manifest.json`.\n- Per-repo lockfile: `//.lock` (advisory `fcntl.flock`, exclusive on capture and GC).\n- Staging dirs (`..tmp.`) during capture; renamed atomically into the final entry path. Stale staging dirs from crashed captures are swept on GC.\n\n### Populate and capture semantics\n\n- **Populate** (workspace ← cache) runs inside `ensure_workspace`. On a key hit the `.node` is **hardlinked** into the workspace (zero-copy, shared inode); the companion `index.d.ts` / `index.js` / `embedded-addon.js` are **copied** (independent inodes) because the napi build's `installGeneratedBindings` and `gen-enums.ts` rewrite those files via `open(..., 'w')` — an in-place truncate that would otherwise propagate through a hardlink and corrupt the cache. Cross-device hardlink failures (`EXDEV`) fall back to copy.\n- **Capture** (cache ← workspace) runs from the post-task success path when the build produced a complete artifact set. Capture uses **copy**, not hardlink: hardlinking a slot-owned workspace file would preserve slot UID ownership on the cached inode and defeat the shared-group model. Copying creates a fresh root-owned, `gid=omp` inode via the setgid cache root. Capture is idempotent under the per-repo flock: a concurrent capture for the same key returns the existing entry.\n\n### Garbage collection\n\nA periodic GC loop runs in `WorkerPool` with two caps per repo. When either cap is exceeded, oldest entries (by `manifest.json.captured_at`) are dropped first:\n\n- entry count cap (`max_entries_per_repo`, default 8)\n- byte cap (`max_bytes`, default 4 GiB)\n\nWorkspaces that hardlinked a `.node` before GC retain access via the kernel inode refcount — `rmtree` of the cache entry does not delete the file from the workspace.\n\n### Configuration (settings on `robomp.config.Settings`)\n\n| Env var | Default | Effect |\n| -------------------------------------------- | ------------------------ | ------------------------------------------------------------- |\n| `ROBOMP_NATIVES_CACHE_ENABLED` | `true` | Master switch. When false the populate/capture hooks no-op and every workspace builds from scratch. |\n| `ROBOMP_NATIVES_CACHE_ROOT` | `/data/cache/pi-natives` | Cache root directory. Must be `root:omp 02770` for cross-slot reads. |\n| `ROBOMP_NATIVES_CACHE_MAX_ENTRIES_PER_REPO` | `8` | LRU entry-count cap, per repo slug. |\n| `ROBOMP_NATIVES_CACHE_MAX_BYTES` | `4294967296` (4 GiB) | LRU byte cap, per repo slug. |\n| `ROBOMP_NATIVES_CACHE_GC_INTERVAL_SECONDS` | `3600` | Period of the background GC loop in `WorkerPool`. |\n\n### Manual invalidation\n\n- One key: `rm -rf /data/cache/pi-natives//`.\n- One repo: `rm -rf /data/cache/pi-natives/`.\n- Everything: `rm -rf /data/cache/pi-natives/*` (preserve the root so its setgid mode survives).\n- Stuck lock: `rm /data/cache/pi-natives//.lock` (only when no orchestrator process is touching the repo).\n\nTrigger an automatic miss by editing any path in the key set: a single touched byte under `crates/`, `Cargo.lock`, `Cargo.toml`, `rust-toolchain.toml`, or `packages/natives/` shifts the tree hash and forces a fresh build at the next populate.\n", "natives-media-system-utils.md": "# Natives media + system utilities\n\nThis document covers the media/system/conversion exports in `@oh-my-pi/pi-natives`: image processing, HTML conversion, clipboard access, token counting, macOS appearance/power helpers, ProjFS helpers, and work profiling.\n\n## Implementation files\n\n- `crates/pi-natives/src/image.rs`\n- `crates/pi-natives/src/html.rs`\n- `crates/pi-natives/src/clipboard.rs`\n- `crates/pi-natives/src/tokens.rs`\n- `crates/pi-natives/src/appearance.rs`\n- `crates/pi-natives/src/power.rs`\n- `crates/pi-natives/src/projfs_overlay.rs`\n- `crates/pi-natives/src/prof.rs`\n- `crates/pi-natives/src/task.rs`\n- `packages/natives/native/index.d.ts`\n\n> Note: there is no `crates/pi-natives/src/work.rs`; work profiling is implemented in `prof.rs` and fed by instrumentation in `task.rs`.\n\n## JS API ↔ Rust export/module mapping\n\n| JS export | Rust N-API export | Rust module |\n| --------------------------------------------------- | ------------------------------ | ------------------- |\n| `PhotonImage.parse(bytes)` | `PhotonImage::parse` | `image.rs` |\n| `PhotonImage#resize(width, height, filter)` | `PhotonImage::resize` | `image.rs` |\n| `PhotonImage#encode(format, quality)` | `PhotonImage::encode` | `image.rs` |\n| `encodeSixel(bytes, targetWidthPx, targetHeightPx)` | `encode_sixel` | `image.rs` |\n| `htmlToMarkdown(html, options?)` | `html_to_markdown` | `html.rs` |\n| `copyToClipboard(text)` | `copy_to_clipboard` | `clipboard.rs` |\n| `readImageFromClipboard()` | `read_image_from_clipboard` | `clipboard.rs` |\n| `countTokens(input, encoding?)` | `count_tokens` | `tokens.rs` |\n| `detectMacOSAppearance()` | `detect_mac_os_appearance` | `appearance.rs` |\n| `MacAppearanceObserver.start(callback)` | `MacAppearanceObserver::start` | `appearance.rs` |\n| `MacOSPowerAssertion.start(options?)` | `MacOSPowerAssertion::start` | `power.rs` |\n| `projfsOverlayProbe/start/stop` | ProjFS exports | `projfs_overlay.rs` |\n| `getWorkProfile(lastSeconds)` | `get_work_profile` | `prof.rs` |\n\n## Data format boundaries and conversions\n\n### Image (`image`)\n\n- **JS input boundary**: `Uint8Array` encoded image bytes for `PhotonImage.parse` and `encodeSixel`.\n- **Rust decode boundary**: bytes are copied/read, format is guessed with `ImageReader::with_guessed_format()`, then decoded to `DynamicImage`.\n- **In-memory state**: `PhotonImage` stores `Arc`.\n- **Output boundary**:\n - `PhotonImage#encode(format, quality)` returns a promise for encoded bytes (`Vec` in Rust; generated TS currently declares `Promise>`).\n - `encodeSixel(...)` returns a SIXEL escape string synchronously.\n\nFormat IDs:\n\n- `0`: PNG\n- `1`: JPEG\n- `2`: WebP\n- `3`: GIF\n\nEncoding behavior:\n\n- JPEG uses the provided `quality` with `JpegEncoder::new_with_quality`.\n- WebP uses the `webp` crate encoder with `quality` as `f32` in the same 0..=100 range.\n- PNG/GIF ignore `quality`.\n- Invalid dimensions for SIXEL (`0` width or height) fail with `Target SIXEL dimensions must be greater than zero`.\n\n### HTML conversion (`html`)\n\n- **JS input boundary**: HTML `string` + optional `{ cleanContent?: boolean; skipImages?: boolean }`.\n- **Rust conversion boundary**: conversion is scheduled through `task::blocking(\"html_to_markdown\", (), ...)`.\n- **Output boundary**: Markdown `string` promise.\n\nConversion behavior:\n\n- `cleanContent` defaults to `false`.\n- When `cleanContent=true`, preprocessing uses `PreprocessingPreset::Aggressive` and hard-removal flags for navigation/forms.\n- `skipImages` defaults to `false`.\n\n### Clipboard (`clipboard`)\n\n- `copyToClipboard(text)` is a synchronous native call using `arboard::Clipboard::set_text`.\n- `readImageFromClipboard()` runs in `task::blocking(\"clipboard.read_image\", (), ...)`.\n- Image read returns `null`/`undefined` when `arboard` reports `ContentNotAvailable`.\n- Successful image read re-encodes clipboard RGBA data as PNG and returns `{ data: Uint8Array, mimeType: \"image/png\" }`.\n- Clipboard access or image encoding failures reject/throw as native errors.\n\nThere is no current `packages/natives` TS wrapper that emits OSC52, handles Termux, or suppresses native clipboard failures. Any best-effort clipboard policy must live in consumers.\n\n### Tokens (`tokens`)\n\n- `countTokens(input, encoding?)` accepts a single string or an array of strings.\n- Arrays return one aggregate token count; encoding work is parallelized in Rust.\n- Default encoding is `O200kBase`; `Cl100kBase` is also exported.\n- The implementation uses ordinary encoding, not special-token handling.\n\n### macOS appearance and power helpers\n\n- `detectMacOSAppearance()` returns `\"dark\"`, `\"light\"`, or `null` on non-macOS.\n- `MacAppearanceObserver.start(callback)` returns a handle with `stop()`; on macOS it uses distributed notifications plus a 2-second polling fallback, and on non-macOS it is a no-op observer.\n- `MacOSPowerAssertion.start(options?)` returns a handle with `stop()`; on macOS it acquires an IOKit assertion, and on other platforms it is a no-op handle.\n\n### Windows ProjFS helpers\n\n- `projfsOverlayProbe()` reports whether ProjFS APIs are available.\n- `projfsOverlayStart(lowerRoot, projectionRoot)` starts an overlay.\n- `projfsOverlayStop(projectionRoot)` stops an overlay session.\n\nThese helpers are platform-specific; availability must be checked before relying on overlay behavior.\n\n### Work profiling (`work`)\n\n- **Collection boundary**: profiling samples are produced by `profile_region(tag)` guards in `task::blocking` and `task::future`.\n- **Storage format**: fixed-size circular buffer (`MAX_SAMPLES = 10_000`) storing stack path, duration, and timestamp.\n- **Output boundary**: `getWorkProfile(lastSeconds)` returns:\n - `folded`: folded-stack text (flamegraph input)\n - `summary`: markdown table summary\n - `svg`: optional flamegraph SVG\n - `totalMs`, `sampleCount`\n\n## Lifecycle and state transitions\n\n### Image lifecycle\n\n1. `PhotonImage.parse(bytes)` schedules a blocking decode task (`image.decode`).\n2. On success, a native `PhotonImage` handle exists in JS.\n3. `resize(...)` creates a new native handle (`image.resize`); old and new handles can coexist.\n4. `encode(...)` schedules `image.encode` and materializes bytes without mutating image dimensions.\n5. `encodeSixel(...)` decodes, optionally resizes to exact target dimensions with Lanczos3, and returns SIXEL text synchronously.\n\nFailure transitions:\n\n- Format detection/decode failure rejects parse promise or throws from SIXEL encoding.\n- Encode failure rejects encode promise.\n- Invalid SIXEL dimensions throw.\n\n### HTML lifecycle\n\n1. `htmlToMarkdown(html, options)` schedules a blocking conversion task.\n2. Conversion runs with defaulted options (`cleanContent=false`, `skipImages=false`) unless specified.\n3. Returns markdown string or rejects.\n\n### Clipboard lifecycle\n\n- Text copy constructs an `arboard::Clipboard` and calls `set_text` synchronously.\n- Image read constructs an `arboard::Clipboard`, calls `get_image`, encodes PNG on success, maps `ContentNotAvailable` to `None`, and rejects other errors.\n\n### Work profiling lifecycle\n\n1. No explicit start: profiling is active when task helpers execute.\n2. Every instrumented task scope records one sample on guard drop.\n3. Samples overwrite oldest entries after buffer capacity is reached.\n4. `getWorkProfile(lastSeconds)` reads a time window and derives folded/summary/svg artifacts.\n\nFailure transitions:\n\n- SVG generation failure is soft (`svg` omitted/undefined), while folded and summary still return.\n- Empty sample windows return empty folded data and no SVG, not an error.\n\n## Unsupported operations and error propagation\n\n### Image\n\n- Unsupported decode input or corrupted bytes: strict failure.\n- Invalid SIXEL target dimensions: strict failure.\n- No JS fallback path in the natives package.\n\n### HTML\n\n- Conversion errors are strict failures.\n- Option omission is defaulting, not failure.\n\n### Clipboard\n\n- Text copy is strict at the native API surface.\n- Image read distinguishes \"no image\" (`null`/`undefined`) from operational failure (rejection).\n\n### Work profiling\n\n- Retrieval is strict for the function call itself.\n- Flamegraph SVG generation is nullable/optional.\n- Buffer truncation is expected ring-buffer behavior.\n\n## Platform caveats\n\n- Clipboard access depends on OS/session support exposed through `arboard`.\n- macOS appearance and power helpers intentionally return no-op/null behavior on unsupported platforms.\n- ProjFS helpers are Windows-specific and should be gated by `projfsOverlayProbe()`.\n", "natives-rust-task-cancellation.md": "# Native Rust task execution and cancellation (`pi-natives`)\n\nThis document describes how `crates/pi-natives` schedules native work and how cancellation flows from JS options (`timeoutMs`, `AbortSignal`) into Rust execution.\n\n## Implementation files\n\n- `crates/pi-natives/src/task.rs`\n- `crates/pi-natives/src/grep.rs`\n- `crates/pi-natives/src/glob.rs`\n- `crates/pi-natives/src/fd.rs`\n- `crates/pi-natives/src/ast.rs`\n- `crates/pi-natives/src/shell.rs`\n- `crates/pi-natives/src/pty.rs`\n- `crates/pi-natives/src/html.rs`\n- `crates/pi-natives/src/image.rs`\n- `crates/pi-natives/src/clipboard.rs`\n- `crates/pi-natives/src/text.rs`\n- `crates/pi-natives/src/ps.rs`\n\n## Core primitives (`task.rs`)\n\n`task.rs` defines:\n\n1. `task::blocking(tag, cancel_token, work)`\n - Wraps `napi::AsyncTask` / `Task`.\n - `compute()` runs on libuv worker threads.\n - Returns a JS `Promise` for exported functions.\n - Records a profiling sample through `profile_region(tag)`.\n\n2. `task::future(env, tag, work)`\n - Wraps `env.spawn_future(...)`.\n - Runs async work on Tokio's runtime.\n - Returns `PromiseRaw<'env, T>`.\n - Records a profiling sample through `profile_region(tag)`.\n\n3. `CancelToken` / `AbortToken` / `AbortReason`\n - `CancelToken::new(timeout_ms, signal)` combines an optional deadline and optional JS `AbortSignal` converted from `Unknown`.\n - `CancelToken::heartbeat()` is cooperative cancellation for blocking loops.\n - `CancelToken::wait()` asynchronously waits for signal, timeout, or Ctrl-C.\n - `CancelToken::emplace_abort_token()` creates an abortable flag when a later `Shell.abort()`/internal bridge needs one.\n - `AbortToken::abort(reason)` lets external code request abort.\n\n## `blocking` vs `future`: execution model and selection\n\n### Use `task::blocking`\n\nUse when work is CPU-heavy or fundamentally synchronous/blocking:\n\n- regex/file scanning (`grep`, `glob`, `fuzzyFind`)\n- ast-grep search/edit worker work\n- PTY loop internals through `tokio::task::spawn_blocking`\n- image decode/resize/encode\n- HTML conversion\n- clipboard image read\n\nBehavior:\n\n- Work closure receives a cloned `CancelToken`.\n- Cancellation is only observed where code checks `ct.heartbeat()?`.\n- Closure `Err(...)` rejects the JS promise.\n\n### Use `task::future`\n\nUse when work must `await` async operations:\n\n- shell session orchestration (`Shell.run`, `executeShell`)\n- PTY outer promise (`PtySession.start`) before it enters `spawn_blocking`\n- task racing (`tokio::select!`) between completion and cancellation\n\nBehavior:\n\n- Future code can race normal completion against `ct.wait()`.\n- On cancel path, async implementations typically cancel subordinate machinery and may force-abort after a grace timeout.\n\n## JS API ↔ Rust export mapping (task/cancel relevant)\n\n| JS-facing API | Rust export | Scheduler | Cancellation hookup |\n| --------------------------------------- | ------------------------------------ | -------------------------------------------------------------- | ---------------------------------------------------------------------------------------- |\n| `grep(options, onMatch?)` | `grep` | `task::blocking(\"grep\", ct, ...)` | `CancelToken::new(options.timeoutMs, options.signal)` + heartbeat checks |\n| `glob(options, onMatch?)` | `glob` | `task::blocking(\"glob\", ct, ...)` | `CancelToken::new(...)` + heartbeat checks |\n| `fuzzyFind(options)` | `fuzzy_find` | `task::blocking(\"fuzzy_find\", ct, ...)` | `CancelToken::new(...)` + heartbeat checks |\n| `astGrep(options)` / `astEdit(options)` | ast exports | blocking worker path | timeout/signal fields are accepted by options and checked cooperatively in worker loops |\n| `Shell#run(options, onChunk?)` | `Shell::run` | `task::future(env, \"shell.run\", ...)` | `ct.wait()` raced against run task; bridges to Tokio cancellation token and `AbortToken` |\n| `executeShell(options, onChunk?)` | `execute_shell` | `task::future(env, \"shell.execute\", ...)` | same cancel race and 2s graceful window |\n| `PtySession#start(options, onChunk?)` | `PtySession::start` | `task::future(env, \"pty.start\", ...)` + inner `spawn_blocking` | `CancelToken` checked in sync PTY loop via `heartbeat()` |\n| `htmlToMarkdown(html, options?)` | `html_to_markdown` | `task::blocking(\"html_to_markdown\", (), ...)` | none (`()` token) |\n| `PhotonImage.parse/encode/resize` | `PhotonImage::{parse,encode,resize}` | `task::blocking(...)` | none (`()` token) |\n| `readImageFromClipboard()` | `read_image_from_clipboard` | `task::blocking(\"clipboard.read_image\", (), ...)` | none (`()` token) |\n\n`text.rs`, `tokens.rs`, `keys.rs`, most `ps.rs` functions, and synchronous utility exports do not use `task::blocking`/`task::future` and therefore do not participate in this cancellation path.\n\n## Cancellation lifecycle and state transitions\n\n### `CancelToken` lifecycle\n\n```text\nCreated\n ├─ no signal + no timeout -> passive token\n ├─ signal registered -> AbortSignal callback can set AbortReason::Signal\n └─ deadline set -> timeout check becomes active\n\nRunning\n ├─ heartbeat()/wait() sees signal -> AbortReason::Signal\n ├─ heartbeat()/wait() sees deadline -> AbortReason::Timeout\n ├─ wait() sees Ctrl-C -> AbortReason::User\n └─ no abort -> continue\n\nAborted\n └─ flag stores first observed cause for waiters; heartbeat formats it as \"Aborted: \"\n```\n\n### Before-start vs mid-execution cancellation\n\n- **Before start / before first cancellation check**:\n - `task::future` users that race on `ct.wait()` can resolve cancellation once they enter `select!`.\n - `task::blocking` users only observe cancellation when closure code reaches `heartbeat()`.\n\n- **Mid-execution**:\n - `blocking`: next `heartbeat()` returns `Err(\"Aborted: ...\")`.\n - `future`: `ct.wait()` branch wins `select!`, then code cancels subordinate async machinery.\n - shell: cancellation triggers a Tokio cancellation token, waits up to 2 seconds, then aborts the task if needed.\n - PTY: heartbeat failure or `kill()` terminates PTY child/process tree and drains output briefly.\n\n## Heartbeat expectations for long-running loops\n\n`heartbeat()` must run at predictable cadence in loops with unbounded or large work sets.\n\nObserved patterns:\n\n- `glob` filtering checks entries during scan/filter work.\n- `fd` scoring checks scanned candidates.\n- `grep` checks before/during expensive search and passes tokens into shared scan/cache helpers.\n- `run_pty_sync` checks every loop tick with a maximum 16ms wait cadence.\n\nPractical rule: no loop over external-size input should exceed a short bounded interval without a heartbeat.\n\n## Failure behavior and error propagation to JS\n\n### Blocking tasks\n\nError path:\n\n1. Closure returns `Err(napi::Error)` (including `heartbeat()` abort).\n2. `Task::compute()` returns `Err`.\n3. `AsyncTask` rejects JS promise.\n\nTypical error strings:\n\n- `Aborted: Timeout`\n- `Aborted: Signal`\n- domain errors (`Failed to decode image: ...`, `Conversion error: ...`, etc.)\n\n### Future tasks\n\nError path:\n\n1. Async body returns `Err(napi::Error)` or join failure is mapped (`... task failed: {err}`).\n2. `task::future`-spawned promise rejects.\n3. Shell and PTY command APIs model cancellation as structured results instead of rejection when the cancellation path wins: `exitCode` omitted, `cancelled` or `timedOut` set.\n\n### Cancellation reporting split\n\n- **Abort as error**: blocking exports using `heartbeat()?`.\n- **Abort as typed result**: shell/PTY command APIs that model cancellation in result structs.\n\nChoose one model per API and document it explicitly.\n\n## Common pitfalls\n\n1. **Missing heartbeat in blocking loops**\n - Symptom: timeout/signal appears ignored until loop ends.\n - Fix: add `ct.heartbeat()?` at loop top and before expensive per-item steps.\n\n2. **Long uncancelable sections**\n - Symptom: cancellation latency spikes during single large call (decode, sort, compression, parser invocation, etc.).\n - Fix: split work into chunks with heartbeat boundaries; if impossible, document latency.\n\n3. **Blocking async executor**\n - Symptom: async API stalls when sync-heavy code runs directly in future.\n - Fix: move CPU/sync blocks to `task::blocking` or `tokio::task::spawn_blocking`.\n\n4. **Inconsistent cancel semantics**\n - Symptom: one API rejects on cancel, another resolves with flags, confusing callers.\n - Fix: standardize per domain and keep docs aligned.\n\n5. **Forgetting cancellation bridge in nested async tasks**\n - Symptom: outer token is cancelled but inner readers/subprocess tasks keep running.\n - Fix: bridge cancellation to inner token/signal and enforce grace timeout + forced abort fallback.\n\n## Checklist for new cancellable exports\n\n1. Classify work correctly:\n - CPU-bound or sync blocking -> `task::blocking`.\n - async I/O / `await` orchestration -> `task::future`.\n\n2. Expose cancel inputs when needed:\n - include `timeoutMs` and `signal` in `#[napi(object)]` options,\n - create `let ct = task::CancelToken::new(timeout_ms, signal);`.\n\n3. Wire cancellation through all layers:\n - blocking loops: `ct.heartbeat()?` at stable intervals,\n - async orchestration: race with `ct.wait()` and cancel sub-tasks/tokens.\n\n4. Decide cancellation contract:\n - reject promise with abort error, or\n - resolve typed `{ cancelled, timedOut, ... }`,\n - keep this contract consistent for the API family.\n\n5. Propagate failures with context:\n - map errors via `Error::from_reason(format!(\"...: {err}\"))`,\n - include stage-specific prefixes (`spawn`, `decode`, `wait`, etc.).\n\n6. Handle before-start and mid-flight cancellation:\n - cancellation check/await must happen before expensive body and during long execution.\n\n7. Validate no executor misuse:\n - no long sync work directly inside async futures without `spawn_blocking`/blocking task wrapper.\n", "natives-shell-pty-process.md": "# Natives Shell, PTY, Process, and Key Internals\n\nThis document covers the execution/process/terminal primitives in `@oh-my-pi/pi-natives`: `shell`, `pty`, `ps`, and `keys`, using the architecture terms from `docs/natives-architecture.md`.\n\n## Implementation files\n\n- `crates/pi-natives/src/shell.rs`\n- `crates/pi-natives/src/shell/windows.rs` (Windows-only PATH enrichment)\n- `crates/pi-natives/src/pty.rs`\n- `crates/pi-natives/src/ps.rs`\n- `crates/pi-natives/src/keys.rs`\n- `crates/pi-natives/src/task.rs`\n- `packages/natives/native/index.d.ts`\n\n## Layer ownership\n\n- **Package entrypoint** (`packages/natives/native/index.js`): loads the `.node` addon and exports generated N-API bindings.\n- **Rust N-API module layer** (`crates/pi-natives/src/*`): shell/PTY process execution, process-tree traversal/termination, and key-sequence parsing.\n- **Consumers** (`packages/coding-agent`, `packages/tui`): higher-level session policy, output artifact/minimizer handling, render policy, and UI key handling.\n\n## Shell subsystem (`shell`)\n\n### API model\n\nTwo execution modes are exposed:\n\n1. **One-shot** via `executeShell(options, onChunk?)`.\n2. **Persistent session** via `new Shell(options?)` then `shell.run(...)` repeatedly.\n\nBoth stream output through a threadsafe callback and return `{ exitCode?, cancelled, timedOut, minimized? }`.\n\n`ShellOptions` supports `sessionEnv`, `snapshotPath`, and optional output `minimizer`. `ShellExecuteOptions` supports command-scoped `env`, session-level `sessionEnv`, `snapshotPath`, timeout/signal, and optional minimizer. `ShellRunOptions` supports command, cwd, command-scoped env, timeout, and signal.\n\n### Session creation and environment model\n\nRust creates `brush_core::Shell` with:\n\n- non-interactive, non-login mode,\n- `no_profile` and `no_rc`,\n- `do_not_inherit_env: true`,\n- bash-mode builtins, with `exec` and `suspend` disabled,\n- explicit environment reconstruction from host env,\n- skip-list for shell-sensitive vars (`PS1`, `PWD`, `SHLVL`, bash function exports, etc.).\n\nSession env behavior:\n\n- `ShellOptions.sessionEnv` / one-shot `sessionEnv` is applied at session creation.\n- `ShellRunOptions.env` / one-shot `env` is command-scoped (`EnvironmentScope::Command`) and popped after the command.\n- `PATH` is merged specially on Windows with case-insensitive dedupe.\n- Windows-only path enrichment (`shell/windows.rs`) appends discovered Git-for-Windows paths when present and not already included.\n- `snapshotPath`, when present, is sourced during session creation with stdout/stderr/stdin wired to null files.\n\n### Runtime lifecycle and state transitions\n\nPersistent shell (`Shell.run`) uses this state machine:\n\n- **Idle/Uninitialized**: `session: None`.\n- **Running**: first `run()` lazily creates a session, stores an abort token, executes command.\n- **Completed + keepalive**: if execution control flow is normal, abort state is cleared and session is reused.\n- **Completed + teardown**: if control flow is loop/script/shell-exit related, session is dropped.\n- **Cancelled/Timed out**: run task is cancelled, grace wait is 2 seconds, task may be force-aborted, session is dropped if lock can be acquired.\n- **Error**: session is dropped.\n\nOne-shot shell (`executeShell`) always creates and drops a fresh session per call.\n\n### Streaming/output and minimizer behavior\n\n- Stdout/stderr are routed into a shared pipe and read concurrently.\n- Reader decodes UTF-8 incrementally; invalid byte sequences emit `U+FFFD` replacement chunks.\n- The command runs in a new process group policy.\n- Optional minimizer configuration can capture and rewrite output. When minimization occurs, the result includes `minimized` with filter name, replacement text, original text, and byte counts.\n- Consumers are responsible for persisting or displaying minimizer artifacts; the native result only carries the data.\n\n### Cancellation, timeout, and abort\n\n- `CancelToken` is constructed from `timeoutMs` and optional `AbortSignal`.\n- On cancellation/timeout, shell cancellation token is triggered, then task gets a 2-second graceful window before forced abort.\n- Structured result flags are used:\n - timeout -> `exitCode` omitted, `timedOut: true`.\n - abort signal / `Shell.abort()` -> `exitCode` omitted, `cancelled: true`.\n\n`Shell.abort()` behavior:\n\n- aborts the current running command for that `Shell` instance through the stored `AbortToken`,\n- resolves successfully even when nothing is running.\n\n### Failure behavior\n\nCommon surfaced errors include:\n\n- session init failures (`Failed to initialize shell`),\n- cwd errors (`Failed to set cwd`),\n- env set/pop failures,\n- snapshot source failures (`Failed to source snapshot`),\n- pipe creation/clone failures,\n- execution failure (`Shell execution failed: ...`),\n- task wrapper failures (`Shell execution task failed: ...`).\n\n## PTY subsystem (`pty`)\n\n### API model\n\n`new PtySession()` exposes:\n\n- `start(options, onChunk?) -> Promise<{ exitCode?, cancelled, timedOut }>`\n- `write(data)`\n- `resize(cols, rows)`\n- `kill()`\n\n`PtyStartOptions` supports `command`, optional `cwd`, optional `env`, `timeoutMs`, `signal`, `cols`, and `rows`.\n\n### Runtime lifecycle and state transitions\n\n`PtySession` state machine:\n\n- **Idle**: `core: None`.\n- **Reserved**: `start()` installs control channel synchronously (`core: Some`) before async work begins, so `write/resize/kill` become immediately valid.\n- **Running**: blocking PTY loop handles child state, reader events, cancellation heartbeat, and control messages.\n- **Terminal closed / drain**: child exit or cancellation starts a short reader drain window.\n- **Finalized**: `core` is always reset to `None` after start task completion (success or error).\n\nConcurrency guard:\n\n- starting while already running returns `PTY session already running`.\n\n### Spawn/attach/write/read/terminate patterns\n\n- PTY opened via `portable_pty::native_pty_system().openpty(...)`.\n- Command currently runs as `sh -lc ` with optional `cwd` and env overrides.\n- Default size is `120x40`; dimensions are clamped (`cols 20..400`, `rows 5..200`).\n- `write()` sends raw bytes to PTY stdin.\n- `resize()` sends a control message and clamps dimensions again.\n- `kill()` sends a control message that marks the run cancelled and terminates the child/process tree.\n\nOutput path:\n\n- dedicated reader thread reads master stream,\n- incremental UTF-8 decode emits `U+FFFD` for invalid bytes,\n- chunks forwarded through N-API threadsafe callback.\n\nTermination path:\n\n- Unix: terminate process group when known, terminate child tree, call child kill, then repeat with SIGKILL.\n- Non-Unix: terminate child tree, call child kill, then repeat with SIGKILL-equivalent process-tree helper.\n\n### Cancellation and timeout semantics\n\n- `timeoutMs` and `AbortSignal` feed a `CancelToken`.\n- Loop calls `ct.heartbeat()` periodically with a 16ms maximum wait cadence.\n- Timeout classification is based on the heartbeat error string containing `Timeout`.\n- Cancellation/kill starts a 300ms post-cancel drain window; normal child exit starts a 300ms post-exit drain window.\n\n### Failure behavior\n\nError surfaces include:\n\n- PTY allocation/open failure,\n- PTY spawn failure,\n- writer/reader acquisition failure,\n- child status/wait failures,\n- lock poisoning,\n- control-channel disconnection (`PTY session is no longer available`).\n\nControl call failures when not running:\n\n- `write/resize/kill` return `PTY session is not running`.\n\n## Process-tree subsystem (`ps`)\n\n### API model\n\n- `killTree(pid, signal) -> number`\n- `listDescendants(pid) -> number[]`\n\n### Platform-specific implementation\n\n- **Linux**: recursively reads `/proc//task//children`.\n- **macOS**: uses `libproc` `proc_listchildpids`.\n- **Windows**: snapshots process table with `CreateToolhelp32Snapshot`, builds parent->children map, terminates with `OpenProcess(PROCESS_TERMINATE)` + `TerminateProcess`.\n\n### Kill-tree behavior\n\n- Descendants are collected recursively.\n- Kill order is bottom-up (deepest descendants first).\n- Root pid is killed last.\n- Return value is count of successful terminations.\n\nSignal behavior:\n\n- POSIX: provided `signal` is passed to `kill`.\n- Windows: `signal` is ignored; termination is unconditional process terminate.\n\n### Failure behavior\n\nThis module is intentionally non-throwing at API surface for ordinary process misses:\n\n- missing/inaccessible process tree branches are skipped,\n- per-pid kill failures are counted as unsuccessful,\n- lookup miss typically yields `[]` from `listDescendants` and `0` from `killTree`.\n\n## Key parsing subsystem (`keys`)\n\n### API model\n\nExposed helpers:\n\n- `parseKey(data, kittyProtocolActive)`\n- `matchesKey(data, keyId, kittyProtocolActive)`\n- `parseKittySequence(data)`\n- `matchesKittySequence(data, expectedCodepoint, expectedModifier)`\n- `matchesLegacySequence(data, keyName)`\n\n### Parsing model\n\nThe parser combines:\n\n- direct single-byte mappings (`enter`, `tab`, `ctrl+`, printable ASCII),\n- O(1) legacy escape-sequence lookup (PHF map),\n- xterm `modifyOtherKeys` parsing,\n- Kitty protocol parsing (`CSI u`, `CSI ~`, `CSI 1;...`),\n- normalization to key IDs (`ctrl+c`, `shift+tab`, `pageUp`, `f5`, etc.).\n\nModifier handling:\n\n- only shift/alt/ctrl bits are compared for key matching,\n- lock bits are masked out before comparisons.\n\nLayout behavior:\n\n- base-layout fallback is intentionally constrained so remapped layouts do not create false matches for ASCII letters/symbols.\n\n### Failure behavior\n\n- Unrecognized or invalid sequences produce `null` from parse functions.\n- Match functions return `false` on parse failure or mismatch.\n- No thrown error surface for malformed key input.\n\n## JS API ↔ Rust export mapping\n\n### Shell + PTY + Process\n\n| JS API | Rust N-API export | Notes |\n| --------------------------------- | -------------------------------------- | ----------------------------------------- |\n| `executeShell(options, onChunk?)` | `executeShell` (`execute_shell`) | One-shot shell execution |\n| `new Shell(options?)` | `Shell` class | Persistent shell session |\n| `shell.run(options, onChunk?)` | `Shell::run` | Reuses session on keepalive control flow |\n| `shell.abort()` | `Shell::abort` | Aborts active run for that shell instance |\n| `new PtySession()` | `PtySession` class | Stateful PTY session |\n| `pty.start(options, onChunk?)` | `PtySession::start` | Interactive PTY run |\n| `pty.write(data)` | `PtySession::write` | Raw stdin passthrough |\n| `pty.resize(cols, rows)` | `PtySession::resize` | Clamped terminal dimensions |\n| `pty.kill()` | `PtySession::kill` | Force-kills active PTY child |\n| `killTree(pid, signal)` | `killTree` (`kill_tree`) | Children-first process tree termination |\n| `listDescendants(pid)` | `listDescendants` (`list_descendants`) | Recursive descendants listing |\n\n### Keys\n\n| JS API | Rust N-API export | Notes |\n| ---------------------------------------------- | --------------------------------------------------- | ------------------------------- |\n| `matchesKittySequence(data, cp, mod)` | `matchesKittySequence` (`matches_kitty_sequence`) | Kitty codepoint+modifier match |\n| `parseKey(data, kittyProtocolActive)` | `parseKey` (`parse_key`) | Normalized key-id parser |\n| `matchesLegacySequence(data, keyName)` | `matchesLegacySequence` (`matches_legacy_sequence`) | Exact legacy sequence map check |\n| `parseKittySequence(data)` | `parseKittySequence` (`parse_kitty_sequence`) | Structured Kitty parse result |\n| `matchesKey(data, keyId, kittyProtocolActive)` | `matchesKey` (`matches_key`) | High-level key matcher |\n\n## Abandoned session cleanup and finalization notes\n\n- **Shell persistent session**: if a run is cancelled/timed out/errors/non-keepalive control flow, Rust drops the internal session state. Successful normal runs keep the session for reuse.\n- **PTY session**: `core` is always cleared after `start()` finishes, including failure paths.\n- **No explicit JS finalizer-driven kill contract** is exposed by wrappers; cleanup is primarily tied to run completion/cancellation paths. Callers should use `timeoutMs`, `AbortSignal`, `shell.abort()`, or `pty.kill()` for deterministic teardown.\n", "natives-text-search-pipeline.md": "# Natives Text/Search Pipeline\n\nThis document maps the `@oh-my-pi/pi-natives` text/search/code surface from generated JS/TS exports to Rust N-API modules and back to JS result objects.\n\nTerminology follows `docs/natives-architecture.md`:\n\n- **Generated binding**: public API in `packages/natives/native/index.d.ts`.\n- **Rust module layer**: N-API exports in `crates/pi-natives/src/*`.\n- **Shared scan cache**: `fs_cache`-backed directory-entry cache used by discovery/search flows.\n\n## Implementation files\n\n- `packages/natives/native/index.d.ts`\n- `crates/pi-natives/src/grep.rs`\n- `crates/pi-natives/src/glob.rs`\n- `crates/pi-natives/src/glob_util.rs`\n- `crates/pi-natives/src/fs_cache.rs`\n- `crates/pi-natives/src/fd.rs`\n- `crates/pi-natives/src/ast.rs`\n- `crates/pi-natives/src/text.rs`\n- `crates/pi-natives/src/highlight.rs`\n- `crates/pi-natives/src/tokens.rs`\n\n## JS API ↔ Rust export mapping\n\n| JS API | Rust export (`#[napi]`, snake_case -> camelCase) | Rust module |\n| ------------------------------------------------------------------------------- | ------------------------------------------------ | -------------- |\n| `grep(options, onMatch?)` | `grep` | `grep.rs` |\n| `search(content, options)` | `search` | `grep.rs` |\n| `hasMatch(content, pattern, ignoreCase?, multiline?)` | `hasMatch` | `grep.rs` |\n| `fuzzyFind(options)` | `fuzzyFind` | `fd.rs` |\n| `glob(options, onMatch?)` | `glob` | `glob.rs` |\n| `invalidateFsScanCache(path?)` | `invalidateFsScanCache` | `fs_cache.rs` |\n| `astGrep(options)` | `astGrep` | `ast.rs` |\n| `astEdit(options)` | `astEdit` | `ast.rs` |\n| `wrapTextWithAnsi(text, width, tabWidth)` | `wrapTextWithAnsi` | `text.rs` |\n| `truncateToWidth(text, maxWidth, ellipsis, pad, tabWidth)` | `truncateToWidth` | `text.rs` |\n| `sliceWithWidth(line, startCol, length, strict, tabWidth)` | `sliceWithWidth` | `text.rs` |\n| `extractSegments(line, beforeEnd, afterStart, afterLen, strictAfter, tabWidth)` | `extractSegments` | `text.rs` |\n| `visibleWidth(text, tabWidth)` | `visibleWidth` | `text.rs` |\n| `highlightCode(code, lang, colors)` | `highlightCode` | `highlight.rs` |\n| `supportsLanguage(lang)` | `supportsLanguage` | `highlight.rs` |\n| `getSupportedLanguages()` | `getSupportedLanguages` | `highlight.rs` |\n| `countTokens(input, encoding?)` | `countTokens` | `tokens.rs` |\n\n## Pipeline overview by subsystem\n\n## 1) Regex search (`grep`, `search`, `hasMatch`)\n\n### Input/options flow\n\n1. Callers invoke generated native exports directly; there is no package-local TS wrapper that renames `search` to `searchContent`.\n2. Rust option structs in `grep.rs` deserialize camelCase fields (`ignoreCase`, `maxCount`, `contextBefore`, `contextAfter`, `maxColumns`, `timeoutMs`).\n3. `grep` creates `CancelToken` from `timeoutMs` + `AbortSignal` and runs inside `task::blocking(\"grep\", ...)`.\n4. `search` and `hasMatch` operate on provided string/`Uint8Array` content and do not scan the filesystem.\n\n### Execution branches\n\n- **In-memory branch**\n - `search` -> `search_sync` / search helpers over provided content bytes.\n - `hasMatch` compiles/checks pattern against provided content and returns a boolean.\n - No filesystem scan, no `fs_cache`.\n- **Single-file branch**\n - `grep` resolves path, checks metadata is file, and searches that file.\n- **Directory branch**\n - Optional cache lookup via `fs_cache::get_or_scan` when `cache: true`.\n - Fresh scan via `fs_cache::force_rescan` when `cache: false`.\n - Optional empty-result recheck when cached results are older than the empty-result recheck threshold.\n - Entry filtering: file-only + optional glob filter (`glob_util`) + optional type filter mapping (`js`, `ts`, `rust`, etc.).\n\n### Search/collection semantics\n\n- Regex engine: `grep_regex::RegexMatcherBuilder` with `ignoreCase` and `multiline`.\n- Context resolution:\n - `contextBefore/contextAfter` override legacy `context`.\n - Non-content modes do not collect context.\n- Output modes:\n - `content` -> one `GrepMatch` per hit.\n - `count` and `filesWithMatches` map to count-style entries (`lineNumber=0`, `line=\"\"`, `matchCount` set).\n- Limits:\n - Global `offset` and `maxCount` apply across files.\n - Parallel path is used only when `maxCount` is unset and `offset == 0`; otherwise sequential path preserves deterministic global offset/limit semantics.\n\n### Result shaping back to JS\n\n- Rust `SearchResult`/`GrepResult` fields map to TS interfaces via N-API object conversion.\n- Counters are clamped before crossing N-API where needed.\n- `GrepResult.limitReached` is optional and emitted when true.\n- Streaming callback receives each shaped `GrepMatch` for content or count-style entries.\n\n### Failure behavior\n\n- `search` returns `SearchResult.error` for regex/search failures instead of throwing.\n- `grep` rejects on hard errors such as invalid path, invalid glob/regex, or cancellation timeout/abort.\n- `hasMatch` returns a boolean on success and throws on invalid pattern/UTF-8 conversion errors.\n- File open/search errors in multi-file scans are skipped per-file; scan continues.\n\n### Malformed regex handling\n\n`grep.rs` sanitizes braces before regex compile:\n\n- Invalid repetition-like braces are escaped (`{`/`}` -> `\\{`/`\\}`) when they cannot form `{N}`, `{N,}`, `{N,M}`.\n- This prevents common literal-template fragments (for example `${platform}`) from failing as malformed repetition.\n- Remaining invalid regex syntax still returns a regex error.\n\n## 2) File discovery (`glob`) and fuzzy path search (`fuzzyFind`)\n\n`glob` and `fuzzyFind` share `fs_cache` scans; matching logic differs.\n\n### `glob` flow\n\n1. Caller passes `GlobOptions` directly. `pattern` and `path` are required in the generated type.\n2. Rust resolves the search path and compiles pattern via `glob_util::compile_glob`.\n3. Entry source:\n - `cache=true` -> `get_or_scan` + optional stale-empty `force_rescan`.\n - `cache=false` -> `force_rescan(..., store=false)` (fresh only).\n4. Filtering:\n - skip `.git` always;\n - skip `node_modules` unless requested (`includeNodeModules`) or pattern mentions `node_modules`;\n - apply glob match;\n - apply file-type filter; symlink `file`/`dir` filters resolve target metadata.\n5. Optional sort by mtime descending (`sortByMtime`) before truncating to `maxResults`.\n\n### `fuzzyFind` flow\n\n1. Rust implementation lives in `fd.rs`; generated export is `fuzzyFind`.\n2. Shared scan source from `fs_cache` with the same cache/no-cache split and stale-empty recheck policy.\n3. Scoring:\n - exact / starts-with / contains / subsequence-based fuzzy score;\n - separator/punctuation-normalized scoring path;\n - directory bonus and deterministic tie-break (`score desc`, then `path asc`).\n4. Symlink entries are excluded from fuzzy results.\n\n### Failure behavior\n\n- Invalid glob pattern returns an error from `glob_util::compile_glob`.\n- Search root must resolve to an existing directory for directory discovery flows.\n- Cancellation/timeouts propagate as abort errors via `CancelToken::heartbeat()` checks in loops.\n\n### Malformed glob handling\n\n`glob_util::build_glob_pattern` is tolerant:\n\n- normalizes `\\` to `/`,\n- auto-prefixes simple recursive patterns with `**/` when `recursive=true`,\n- auto-closes unbalanced `{...` alternation groups before compile.\n\n## 3) AST search/edit (`astGrep`, `astEdit`)\n\n`ast.rs` exposes syntax-aware code search and rewrite operations.\n\n- `astGrep(options)` returns matches with byte/line/column coordinates and optional metavariable bindings.\n- `astEdit(options)` returns replacement changes, per-file counts, searched/touched file counts, parse errors, and whether edits were applied.\n- `dryRun` defaults to true for edit options in the generated documentation.\n- Options include language override, path/glob/selector, strictness, limits, parse-error policy, `signal`, and `timeoutMs`.\n\nThese exports are direct native APIs used by tooling; they are not mediated by a TS wrapper in `packages/natives`.\n\n## 4) Shared scan/cache lifecycle (`fs_cache`)\n\n`fs_cache` stores scan results as normalized relative entries (`path`, `fileType`, optional `mtime`) keyed by:\n\n- canonical search root,\n- `include_hidden`,\n- `use_gitignore`.\n\n### Cache state transitions\n\n1. **Miss / disabled**\n - TTL is `0` or key absent/expired -> fresh collection.\n2. **Hit**\n - Entry age is within TTL -> return cached entries + `cache_age_ms`.\n3. **Stale-empty recheck**\n - If query yields zero matches and cache age exceeds the empty-result threshold, force one rescan.\n4. **Invalidation**\n - `invalidateFsScanCache(path?)`:\n - no arg: clear all keys;\n - path arg: remove keys for roots affected by that path.\n\n### Stale-result tradeoff\n\n- Cache favors low-latency repeated scans over immediate consistency.\n- TTL window can return stale positives/negatives.\n- Empty-result recheck reduces stale negatives for older cached scans at the cost of one extra scan.\n- Explicit invalidation is the intended correctness hook after file mutations.\n\n## 5) ANSI text utilities (`text`)\n\nThese are pure, in-memory utilities.\n\n### Boundaries and responsibilities\n\n- `text.rs` owns terminal-cell semantics:\n - ANSI sequence parsing,\n - grapheme-aware width and slicing,\n - wrap/truncate/sanitize behavior,\n - explicit tab-width parameter on width-sensitive APIs.\n- `grep.rs` line truncation (`maxColumns`) is separate:\n - simple character-boundary truncation of matched lines with `...`,\n - not ANSI-state-preserving and not terminal-cell width aware.\n\n### Key behaviors\n\n- `wrapTextWithAnsi`: wraps by visible width, carries active SGR codes across wrapped lines.\n- `truncateToWidth`: visible-cell truncation with ellipsis policy (`Unicode`, `Ascii`, `Omit`), optional right padding.\n- `sliceWithWidth`: column slicing with optional strict width enforcement.\n- `extractSegments`: extracts before/after segments around an overlay while restoring ANSI state for the `after` segment.\n- `sanitizeText` (ANSI/control/surrogate stripping with line-ending normalization) no longer lives in `text.rs`; it moved to `@oh-my-pi/pi-utils` as a pure-JS implementation in `packages/utils/src/sanitize-text.ts`. The native binding was removed in the same change because the JS version was competitive on the benchmarked workloads, and keeping a Rust copy forced every caller (including `pi-utils`) to pull in `@oh-my-pi/pi-natives`.\n- `visibleWidth`: counts visible terminal cells using caller-supplied tab width.\n\n### Failure behavior\n\nText functions generally return deterministic transformed output; errors are limited to N-API argument/string conversion boundaries.\n\n## 6) Syntax highlighting (`highlight`)\n\n`highlight.rs` is pure transformation; it does not use the filesystem scan cache.\n\n### Flow\n\n1. Caller passes `code`, optional `lang`, and ANSI color palette.\n2. Rust resolves syntax by token/name lookup, extension lookup, alias table fallback, then plain-text fallback.\n3. Each line is parsed with syntect `ParseState` and scope stack.\n4. Scopes map to semantic color categories and ANSI color codes are injected/reset.\n\n### Failure behavior\n\n- Per-line parse failure does not fail the call: that line is appended unhighlighted and processing continues.\n- Unknown/unsupported language falls back to plain text syntax.\n\n## 7) Token counting (`tokens`)\n\n`countTokens(input, encoding?)` is an in-memory utility.\n\n- `input` may be a single string or an array of strings.\n- Arrays return one aggregate count and are encoded in parallel in Rust.\n- Default encoding is `O200kBase`; `Cl100kBase` is also available.\n- The implementation uses ordinary tokenization, not special-token handling.\n\n## Pure utility vs filesystem-dependent flows\n\n| Flow | Filesystem access | Shared cache | Notes |\n| ---------------------------- | ----------------- | -------------------- | --------------------------------------------- |\n| `search` / `hasMatch` | No | No | regex on provided bytes/string only |\n| `text` module functions | No | No | ANSI/width/sanitization only |\n| `highlight` module functions | No | No | syntax + ANSI coloring only |\n| `countTokens` | No | No | tokenization only |\n| `astGrep` / `astEdit` | Yes | No | syntax-aware file search/edit |\n| `glob` | Yes | Optional | directory scans + glob filtering |\n| `fuzzyFind` | Yes | Optional | directory scans + fuzzy scoring |\n| `grep` (file/dir path) | Yes | Optional in dir mode | ripgrep over files, optional filters/callback |\n\n## End-to-end lifecycle summary\n\n1. Caller invokes generated native export with typed options.\n2. Rust validates/normalizes options and builds matcher/search config.\n3. For filesystem flows, entries are scanned (cache hit/miss/rescan where applicable) then filtered/scored/searched.\n4. Worker loops periodically call cancel heartbeat; timeout/abort can terminate execution.\n5. Rust shapes outputs into N-API objects (`lineNumber`, `matchCount`, `limitReached`, etc.).\n6. Generated bindings return typed JS objects and optional per-match callbacks for `grep`/`glob`.\n", "non-compaction-retry-policy.md": "# Non-compaction auto-retry policy\n\nThis document describes the standard API-error retry path in `AgentSession`.\n\nIt explicitly excludes context-overflow recovery via auto-compaction. Overflow is handled by compaction logic and is documented separately in [`compaction.md`](../docs/compaction.md).\n\n## Implementation files\n\n- [`../src/session/agent-session.ts`](../packages/coding-agent/src/session/agent-session.ts)\n- [`../src/config/settings-schema.ts`](../packages/coding-agent/src/config/settings-schema.ts)\n- [`../src/modes/controllers/event-controller.ts`](../packages/coding-agent/src/modes/controllers/event-controller.ts)\n- [`../src/modes/rpc/rpc-mode.ts`](../packages/coding-agent/src/modes/rpc/rpc-mode.ts)\n- [`../src/modes/rpc/rpc-client.ts`](../packages/coding-agent/src/modes/rpc/rpc-client.ts)\n- [`../src/modes/rpc/rpc-types.ts`](../packages/coding-agent/src/modes/rpc/rpc-types.ts)\n\n## Scope boundary vs compaction\n\nRetry and compaction are checked from the same `agent_end` path, but they are intentionally separated:\n\n1. `agent_end` inspects the last assistant message.\n2. `#isRetryableError(...)` runs first.\n3. If retry is initiated, compaction checks are skipped for that turn.\n4. Context-overflow errors are hard-excluded from retry classification (`isContextOverflow(...)` short-circuits retry).\n5. Overflow therefore falls through to `#checkCompaction(...)` instead of standard retry.\n\nSo: overload/rate/server/network-style failures use this retry policy; context-window overflow uses compaction recovery.\n\n## Retry classification\n\n`#isRetryableError(...)` requires all of the following:\n\n- assistant `stopReason === \"error\"`\n- `errorMessage` exists\n- message is **not** context overflow\n- `errorMessage` matches transient transport/envelope patterns or `isUsageLimitError(...)`\n\nCurrent retryable inputs are regex/string-classified:\n\n- transient transport/envelope failures, including Anthropic stream-envelope failures before `message_start`\n- overloaded/provider-returned-error wording\n- rate limit / usage limit / too many requests\n- HTTP-like server classes: 429, 500, 502, 503, 504\n- service unavailable / server/internal error\n- provider-suggested retry wording, including OpenAI `retry your request` failures\n- network/connection/socket failures, refused/closed connections, upstream connect/reset-before-headers, socket hang up, timeout/timed out, fetch failed, terminated, retry delay wording, and unexpected socket close messages\n\nThis is string-pattern classification, not typed provider error codes.\n\n## Retry lifecycle and state transitions\n\nSession state used by retry:\n\n- `#retryAttempt: number` (`0` means idle)\n- `#retryPromise: Promise | undefined` (tracks in-progress retry lifecycle)\n- `#retryResolve: (() => void) | undefined` (resolves `#retryPromise`)\n- `#retryAbortController: AbortController | undefined` (cancels backoff sleep)\n\nFlow (`#handleRetryableError`):\n\n1. Read `retry` settings group.\n2. If `retry.enabled === false`, stop immediately (`false`, no retry started).\n3. Increment `#retryAttempt`.\n4. Create `#retryPromise` once (first attempt in a chain).\n5. If attempt exceeded `retry.maxRetries`, emit final failure event and stop.\n6. Compute base delay: `retry.baseDelayMs * 2^(attempt-1)`.\n7. For usage-limit errors, parse retry hints and call auth storage (`markUsageLimitReached(...)`); if credential switching succeeds, force delay to `0`, otherwise use a larger retry-after/backoff hint when present.\n8. If no credential switch occurred, suppress the current model selector for cooldown, try configured retry model fallback chains, and force delay to `0` on model switch.\n9. Emit `auto_retry_start`.\n10. Remove the trailing assistant error message from agent runtime state (kept in persisted session history).\n11. Sleep with abort support.\n12. Schedule `agent.continue()` through the post-prompt task scheduler (`delayMs: 1`) for the same prompt generation.\n\n### What resets retry counters\n\n`#retryAttempt` resets to `0` in these cases:\n\n- first successful non-error, non-aborted assistant message after retries started (emits `auto_retry_end { success: true }`)\n- retry cancellation during backoff sleep\n- max retries exceeded path\n\n`#retryPromise` resolves/clears when retry chain ends (success, cancellation, or max-exceeded), via `#resolveRetry()`.\n\n## Backoff and max-attempt semantics\n\nSettings:\n\n- `retry.enabled` (default `true`)\n- `retry.maxRetries` (default `3`)\n- `retry.baseDelayMs` (default `2000`)\n\nAttempt numbering:\n\n- attempt counter is incremented before max-check\n- start events use current attempt (1-based)\n- max-exceeded end event reports `attempt: this.#retryAttempt - 1` (last attempted retry count)\n\nBackoff sequence with default settings:\n\n- attempt 1: 2000 ms\n- attempt 2: 4000 ms\n- attempt 3: 8000 ms\n\nDelay override inputs can come from parsed retry headers (`retry-after-ms`, `retry-after`, `x-ratelimit-reset-ms`, `x-ratelimit-reset`) or usage-limit backoff. Credential/model fallback switches set delay to `0`; otherwise parsed hints can extend the exponential local delay.\n\n## Abort mechanics\n\n### Explicit retry abort\n\n`abortRetry()`:\n\n- aborts `#retryAbortController` (if present)\n- resolves retry promise (`#resolveRetry()`) so awaiters are unblocked\n\nIf abort hits while sleeping, catch path emits:\n\n- `auto_retry_end { success: false, finalError: \"Retry cancelled\" }`\n- resets attempt/controller\n\n### Global operation abort interaction\n\n`abort()` calls `abortRetry()` before aborting the active agent stream. This guarantees retry backoff is cancelled when user issues a general abort.\n\n### TUI interaction\n\nOn `auto_retry_start`, EventController:\n\n- swaps `Esc` handler to `session.abortRetry()`\n- renders loader text: `Retrying (attempt/maxAttempts) in Ns… (esc to cancel)`\n\nOn `auto_retry_end`, it restores prior `Esc` handler and clears loader state.\n\n## Streaming and prompt completion behavior\n\n`prompt()` ultimately waits on `#waitForRetry()` after `agent.prompt(...)` returns.\n\nEffect:\n\n- a prompt call does not fully resolve until any started retry chain finishes (success/failure/cancel)\n- retry lifecycle is part of one logical prompt execution boundary\n\nThis prevents callers from treating a retrying turn as complete too early.\n\n## Controls: settings and RPC\n\n### Configuration knobs\n\nDefined in settings schema under retry group:\n\n- `retry.enabled`\n- `retry.maxRetries`\n- `retry.baseDelayMs`\n- `retry.fallbackChains`\n- `retry.fallbackRevertPolicy` (`\"cooldown-expiry\"` by default; `\"never\"` disables automatic restoration)\n\nProgrammatic toggles in session:\n\n- `setAutoRetryEnabled(enabled)` writes `retry.enabled`\n- `autoRetryEnabled` reads `retry.enabled`\n- `isRetrying` reports whether retry lifecycle promise is active\n\n### RPC controls\n\nRPC command surface:\n\n- `set_auto_retry` → `session.setAutoRetryEnabled(command.enabled)`\n- `abort_retry` → `session.abortRetry()`\n\nClient helpers:\n\n- `RpcClient.setAutoRetry(enabled)`\n- `RpcClient.abortRetry()`\n\nBoth commands return success responses; retry progress/failure details come from streamed session events, not command response payloads.\n\n## Event emission and failure surfacing\n\nSession-level retry events:\n\n- `auto_retry_start { attempt, maxAttempts, delayMs, errorMessage }`\n- `auto_retry_end { success, attempt, finalError? }`\n- `retry_fallback_applied { from, to, role }`\n- `retry_fallback_succeeded { model, role }`\n\nPropagation:\n\n- emitted through `AgentSession.subscribe(...)`\n- forwarded to extension runner as extension events\n- in RPC mode, forwarded directly as JSON event objects (`session.subscribe(event => output(event))`)\n- in TUI, consumed by `EventController` for loader/error UI\n\nFinal failure surfacing:\n\n- On max-exceeded or cancellation, `auto_retry_end.success === false`\n- TUI shows: `Retry failed after N attempts: `\n- Extensions/hooks receive `auto_retry_end` with same fields\n- RPC consumers receive same event object on stdout stream\n\n## Permanent stop conditions\n\nRetry stops and will not auto-continue when any of these occur:\n\n- `retry.enabled` is false\n- error is not retry-classified\n- error is context overflow (delegated to compaction path)\n- max retries exceeded\n- user cancels retry (`abort_retry` or `Esc` during retry loader)\n- global abort (`abort`) cancels retry first\n\nA new retry chain can still start later on a future retryable error after counters reset.\n\n## Operational caveats\n\n- Classification is regex text matching; provider-specific structured errors are not used here.\n- Retry strips the failing assistant error from **runtime context** before re-continue, but session history still keeps that error entry.\n- `RpcSessionState` currently exposes `autoCompactionEnabled` but not an `autoRetryEnabled` field; RPC callers must track their own toggle state or query settings through other APIs.\n- Model fallback changes append temporary `model_change` entries and may later restore the primary model when its cooldown expires, depending on `retry.fallbackRevertPolicy`.\n", "notebook-tool-runtime.md": "# Notebook tool runtime internals\n\nThis document describes the current `notebook` tool implementation and its relationship to the kernel-backed Python runtime.\n\nThe critical distinction: **`notebook` is a JSON notebook editor, not a notebook executor**. It edits `.ipynb` cell sources directly; it does not start or talk to a Python kernel.\n\n## Implementation files\n\n- [`src/tools/notebook.ts`](../packages/coding-agent/src/tools/notebook.ts)\n- [`src/eval/py/executor.ts`](../packages/coding-agent/src/eval/py/executor.ts)\n- [`src/eval/py/kernel.ts`](../packages/coding-agent/src/eval/py/kernel.ts)\n- [`src/session/streaming-output.ts`](../packages/coding-agent/src/session/streaming-output.ts)\n- [`src/tools/eval.ts`](../packages/coding-agent/src/tools/eval.ts)\n\n## 1) Runtime boundary: editing vs executing\n\n## `notebook` tool (`src/tools/notebook.ts`)\n\n- Supports `action: edit | insert | delete` on a `.ipynb` file.\n- Resolves path relative to session CWD (`resolveToCwd`).\n- Loads notebook JSON, validates `cells` array, validates `cell_index` bounds.\n- Applies source edits in-memory and writes full notebook JSON back with `JSON.stringify(notebook, null, 1)`.\n- Returns textual summary + structured `details` (`action`, `cellIndex`, `cellType`, `totalCells`, `cellSource`).\n\nNo kernel lifecycle exists in this tool:\n\n- no gateway acquisition\n- no kernel session ID\n- no `execute_request`\n- no stream chunks from kernel channels\n- no rich display capture (`image/png`, JSON display, status MIME)\n\n## Notebook-like execution path (`src/tools/eval.ts` + `src/eval/py/*`)\n\nWhen the agent needs to run cell-style Python code (sequential cells, persistent state, rich displays), that goes through the **`eval` tool** with `language: \"python\"`, not `notebook`.\n\nThat path is where kernel modes, restart/cancel behavior, chunk streaming, and output artifact truncation live.\n\n## 2) Notebook cell handling semantics (`notebook` tool)\n\n## Source normalization\n\n`content` is split into `source: string[]` with newline preservation:\n\n- each non-final line keeps trailing `\\n`\n- final line has no forced trailing newline\n\nThis mirrors notebook JSON conventions and avoids accidental line concatenation on later edits.\n\n## Action behavior\n\n- `edit`\n - replaces `cells[cell_index].source`\n - preserves existing `cell_type`\n- `insert`\n - inserts at `[0..cellCount]`\n - `cell_type` defaults to `code`\n - code cells initialize `execution_count: null` and `outputs: []`\n - markdown cells initialize only `metadata` + `source`\n- `delete`\n - removes `cells[cell_index]`\n - returns removed `source` in details for renderer preview\n\n## Error surfaces\n\nHard failures are thrown for:\n\n- missing notebook file\n- invalid JSON\n- missing/non-array `cells`\n- out-of-range index (insert and non-insert have different valid ranges)\n- missing `content` for `edit`/`insert`\n\nThese become `Error:` tool responses upstream; renderer uses notebook path + formatted error text.\n\n## 3) Kernel session semantics (where they actually exist)\n\nKernel semantics are implemented in `executePython` / `PythonKernel` and apply to the Python backend of the `eval` tool.\n\n## Modes\n\n`PythonKernelMode`:\n\n- `session` (default)\n - kernels cached in `kernelSessions` map\n - max 4 sessions; oldest evicted on overflow\n - idle/dead cleanup every 30s, timeout after 5 minutes\n - per-session queue serializes execution (`session.queue`)\n- `per-call`\n - creates kernel for request\n - executes\n - always shuts down kernel in `finally`\n\n## Reset behavior\n\n`eval` passes `reset` only for the first cell in a multi-cell Python call; later cells always run with `reset: false`.\n\n## Kernel death / restart / retry\n\nIn session mode (`withKernelSession`):\n\n- dead kernel detected by heartbeat (`kernel.isAlive()` check every 5s) or execute failure.\n- pre-run dead state triggers `restartKernelSession`.\n- execute-time crash path retries once: restart kernel, rerun handler.\n- `restartCount > 1` in same session throws `Python kernel restarted too many times in this session`.\n\nStartup retry behavior:\n\n- shared gateway kernel creation retries once on `SharedGatewayCreateError` with HTTP 5xx.\n\nResource exhaustion recovery:\n\n- detects `EMFILE`/`ENFILE`/\"Too many open files\" style failures\n- clears tracked sessions\n- calls `shutdownSharedGateway()`\n- retries kernel session creation once\n\n## 4) Environment/session variable injection\n\nKernel startup receives the optional session file path from executor:\n\n- `PI_SESSION_FILE` (session state file path)\n\n`PythonKernel.#initializeKernelEnvironment(...)` then runs init script inside kernel to:\n\n- `os.chdir(cwd)`\n- inject env entries into `os.environ`\n- prepend cwd to `sys.path` if missing\n\nImplication:\n\n- prelude helpers that read session context rely on this env var in Python process state.\n\n## 5) Streaming/chunk and display handling (kernel-backed path)\n\nThe kernel client processes Jupyter protocol messages per execution:\n\n- `stream` -> text chunk to `onChunk`\n- `execute_result` / `display_data` ->\n - display text chosen by MIME precedence: `text/markdown` > `text/plain` > converted `text/html`\n - structured outputs captured separately:\n - `application/json` -> `{ type: \"json\" }`\n - `image/png` -> `{ type: \"image\" }`\n - `application/x-omp-status` -> `{ type: \"status\" }` (no text emission)\n- `error` -> traceback text pushed to chunk stream + structured error metadata\n- `input_request` -> emits stdin warning text, sends empty `input_reply`, marks stdin requested\n- completion waits for both `execute_reply` and kernel `status=idle`\n\nCancellation/timeout:\n\n- abort signal triggers `interrupt()` (REST `/interrupt` + control-channel `interrupt_request`)\n- result marks `cancelled=true`\n- timeout path annotates output with `Command timed out after seconds`\n\n## 6) Truncation and artifact behavior\n\n`OutputSink` in `src/session/streaming-output.ts` is used by kernel execution paths (`executeWithKernel`):\n\n- sanitizes every chunk (`sanitizeText`)\n- tracks total/output lines and bytes\n- optional artifact spill file (`artifactPath`, `artifactId`)\n- when in-memory buffer exceeds threshold (`DEFAULT_MAX_BYTES` unless overridden):\n - marks truncated\n - keeps tail bytes in memory (UTF-8 safe boundary)\n - can spill full stream to artifact sink\n\n`dump()` returns:\n\n- visible output text (possibly tail-truncated)\n- truncation flag + counts\n- artifact ID (for `artifact://` references)\n\n`eval` converts this metadata into result truncation notices and TUI warnings.\n\n`notebook` tool does **not** use `OutputSink`; it has no stream/artifact truncation pipeline because it does not execute code.\n\n## 7) Renderer assumptions and formatting\n\n## Notebook renderer (`notebookToolRenderer`)\n\n- call view: status line with action + notebook path + cell/type metadata\n- result view:\n - success summary derived from `details`\n - `cellSource` rendered via `renderCodeCell`\n - markdown cells set language hint `markdown`; other cells have no explicit language override\n - collapsed code preview limit is `PREVIEW_LIMITS.COLLAPSED_LINES * 2`\n - supports expanded mode via shared render options\n - uses render cache keyed by width + expanded state\n\nError rendering assumption:\n\n- if first text content starts with `Error:`, renderer formats as notebook error block.\n\n## Python renderer (for actual execution output)\n\nKernel-backed execution rendering expects:\n\n- per-cell status transitions (`pending/running/complete/error`)\n- optional structured status event section\n- optional JSON output trees\n- truncation warnings + optional `artifact://` pointer\n\nThis renderer behavior is unrelated to `notebook` JSON editing results except that both reuse shared TUI primitives.\n\n## 8) Divergence from eval Python backend behavior\n\nIf \"plain Python execution\" means the `eval` tool with `language: \"python\"`:\n\n- `eval` executes code in a kernel, persists state by mode, streams chunks, captures rich displays, handles interrupts/timeouts, and supports output truncation/artifacts.\n- `notebook` performs deterministic notebook JSON mutations only; no execution, no kernel state, no chunk stream, no display outputs, no artifact pipeline.\n\nIf a workflow needs both:\n\n1. edit notebook source with `notebook`\n2. execute code cells via `eval` with `language: \"python\"` (manually passing code), not through `notebook`\n\nCurrent implementation does not provide a single tool that both mutates `.ipynb` and executes notebook cells through kernel context.\n", "plugin-manager-installer-plumbing.md": "# Plugin manager and installer plumbing\n\nThis document describes how `omp plugin` operations mutate plugin state on disk and how installed plugins become runtime capabilities (tools and extensions today, hooks/commands path resolution available).\n\n## Scope and architecture\n\nThere are two plugin-management implementations in the codebase:\n\n1. **Active path used by CLI commands**: `PluginManager` (`src/extensibility/plugins/manager.ts`)\n2. **Legacy helper module**: installer functions (`src/extensibility/plugins/installer.ts`)\n\n`omp plugin ...` command execution goes through `PluginManager`.\n\n`installer.ts` still documents important safety checks and filesystem behavior, but it is not the path used by `src/commands/plugin.ts` + `src/cli/plugin-cli.ts`.\n\n## Lifecycle: from CLI invocation to runtime availability\n\n```text\nomp plugin ...\n -> src/commands/plugin.ts\n -> runPluginCommand(...) in src/cli/plugin-cli.ts\n -> PluginManager method (install/list/uninstall/link/...)\n -> mutate ~/.omp/plugins/{package.json,node_modules,omp-plugins.lock.json}\n -> runtime discovery: discoverAndLoadCustomTools(...) and discoverAndLoadExtensions(...)\n -> getAllPluginToolPaths(cwd) / getAllPluginExtensionPaths(cwd)\n -> custom tool loader imports tool modules; extension loader imports extension modules\n```\n\n### Command entrypoints\n\n- `src/commands/plugin.ts` defines command/flags and forwards to `runPluginCommand`.\n- `src/cli/plugin-cli.ts` maps subcommands to `PluginManager` methods:\n - `install`, `uninstall`, `list`, `link`, `doctor`, `features`, `config`, `enable`, `disable`\n- No explicit `update` action exists; update is done by re-running `install` with a new package/version spec.\n\n## On-disk model\n\nGlobal plugin state lives under `~/.omp/plugins`:\n\n- `package.json` — dependency manifest used by `bun install`/`bun uninstall`\n- `node_modules/` — installed plugin packages or symlinks\n- `omp-plugins.lock.json` — runtime state:\n - enabled/disabled per plugin\n - selected feature set per plugin\n - persisted plugin settings\n\nProject-local overrides live at:\n\n- `/.omp/plugin-overrides.json`\n\nOverrides are read-only from manager/loader perspective (no write path here) and can disable plugins or override features/settings for this project.\n\n## Plugin spec parsing and metadata interpretation\n\n## Install spec grammar\n\n`parsePluginSpec` (`parser.ts`) supports:\n\n- `pkg` -> `features: null` (defaults behavior)\n- `pkg[*]` -> enable all manifest features\n- `pkg[]` -> enable no optional features\n- `pkg[a,b]` -> enable named features\n- `@scope/pkg@1.2.3[feat]` -> scoped + versioned package with explicit feature selection\n\n`extractPackageName` strips version suffix for on-disk path lookup after install.\n\n## Manifest source and required fields\n\nManifest is resolved as:\n\n1. `package.json.omp`\n2. fallback `package.json.pi`\n3. fallback `{ version: package.version }`\n\nImplications:\n\n- There is no strict schema validation in manager/loader.\n- A package missing `omp`/`pi` is still installable and listable.\n- Runtime plugin loading (`getEnabledPlugins`) skips packages without `omp`/`pi` manifest.\n- `manifest.version` is always overwritten from package `version`.\n\nMalformed `package.json` JSON is a hard failure at read time; malformed manifest shape may fail later only when specific fields are consumed.\n\n## Install/update flow (`PluginManager.install`)\n\n1. Parse feature bracket syntax from install spec.\n2. Validate package name against regex + shell-metacharacter denylist.\n3. Ensure plugin `package.json` exists (`omp-plugins`, private dependencies map).\n4. Run `bun install ` in `~/.omp/plugins`.\n5. Read installed package `node_modules//package.json`.\n6. Resolve manifest and compute `enabledFeatures`:\n - `[*]`: all declared features (or `null` if no feature map)\n - `[a,b]`: validates each feature exists in manifest features map\n - `[]`: empty feature list\n - bare spec: `null` (use defaults policy later in loader)\n7. Upsert lockfile runtime state: `{ version, enabledFeatures, enabled: true }`.\n\n### Update semantics\n\nBecause update is install-driven:\n\n- `omp plugin install pkg@newVersion` updates dependency and lockfile version.\n- Existing settings are preserved; state entry is overwritten for version/features/enabled.\n- No separate “check updates” or transactional migration logic exists.\n\n## Remove flow (`PluginManager.uninstall`)\n\n1. Validate package name.\n2. Run `bun uninstall ` in plugin dir.\n3. Remove plugin runtime state from lockfile:\n - `config.plugins[name]`\n - `config.settings[name]`\n\nIf uninstall command fails, runtime state is not changed.\n\n## List flow (`PluginManager.list`)\n\n1. Read plugin dependency map from `~/.omp/plugins/package.json`.\n2. Load lockfile runtime config (missing file -> empty defaults).\n3. Load project overrides (`/.omp/plugin-overrides.json`, parse/read errors -> empty object with warning).\n4. For each dependency with a resolvable package.json:\n - build `InstalledPlugin` record\n - merge feature/enable state:\n - base from lockfile (or defaults)\n - project overrides can replace feature selection\n - project `disabled` list masks plugin as disabled\n\nThis is the effective state used by CLI status output and settings/features operations.\n\n## Link flow (`PluginManager.link`)\n\n`link` supports local plugin development by symlinking a local package into `~/.omp/plugins/node_modules/`.\n\nBehavior:\n\n1. Resolve `localPath` against manager cwd.\n2. Require local `package.json` and `name` field.\n3. Ensure plugin dirs exist.\n4. For scoped names, create scope directory.\n5. Remove existing path at target link location.\n6. Create symlink.\n7. Add runtime lockfile entry enabled with default features (`null`).\n\nCaveat: current `PluginManager.link` does not enforce the `cwd` path-boundary check present in legacy `installer.ts` (`normalizedPath.startsWith(normalizedCwd)`), so trust is the caller’s responsibility.\n\n## Runtime loading: from installed plugin to callable capabilities\n\n## Discovery gate\n\n`getEnabledPlugins(cwd)` (`plugins/loader.ts`) reads:\n\n- plugin dependency manifest (`package.json`)\n- lockfile runtime state\n- project overrides via `getConfigDirPaths(\"plugin-overrides.json\", { user: false, cwd })`\n\nFiltering:\n\n- skip if no plugin package.json\n- skip if manifest (`omp`/`pi`) absent\n- skip if globally disabled in lockfile\n- skip if project-disabled\n\n## Capability path resolution\n\nFor each enabled plugin:\n\n- `resolvePluginExtensionPaths(plugin)`\n- `resolvePluginToolPaths(plugin)`\n- `resolvePluginHookPaths(plugin)`\n- `resolvePluginCommandPaths(plugin)`\n\nEach resolver includes base entries plus feature entries:\n\n- explicit feature list -> only selected features\n- `enabledFeatures === null` -> enable features marked `default: true`\n\nMissing files are silently skipped (`existsSync` guard).\n\n## Current runtime wiring differences\n\n- **Tools are wired into runtime today** via `discoverAndLoadCustomTools` (`custom-tools/loader.ts`), which calls `getAllPluginToolPaths(cwd)`.\n- **Extensions are wired into runtime today** via `discoverAndLoadExtensions` (`extensions/loader.ts`), which calls `getAllPluginExtensionPaths(cwd)`.\n- Paths are de-duplicated by resolved absolute path in custom tool and extension discovery (`seen` set, first path wins).\n- **Hooks/commands resolvers exist** and are exported, but this code path does not currently wire them into a runtime registry in the same way tools and extensions are wired.\n\n## Lock/state management details\n\n`PluginManager` caches runtime config in memory per instance (`#runtimeConfig`) and lazily loads once.\n\nLoad behavior:\n\n- lockfile missing -> `{ plugins: {}, settings: {} }`\n- lockfile read/parse failure -> warning + same empty defaults\n\nSave behavior:\n\n- writes full lockfile JSON pretty-printed each mutation\n\nNo cross-process locking or merge strategy exists; concurrent writers can overwrite each other.\n\n## Safety checks and trust boundaries\n\n## Input/package validation\n\nActive manager path enforces package-name validation:\n\n- regex for scoped/unscoped package specs (optionally with version)\n- explicit shell metacharacter denylist (`[;&|`$(){}[]<>\\\\]`)\n\nThis limits command-injection risk when invoking `bun install/uninstall`.\n\n## Filesystem trust boundary\n\n- Plugin code executes in-process when custom tool modules are imported; no sandboxing.\n- Manifest relative paths are joined against plugin package directory and only existence-checked.\n- The plugin package itself is trusted code once installed.\n\n## Legacy installer-only checks\n\n`installer.ts` includes additional link-time checks not mirrored in `PluginManager.link`:\n\n- local path must resolve inside project cwd\n- extra package name/path traversal guards for symlink target naming\n\nBecause CLI uses `PluginManager`, these stricter link guards are not currently on the main path.\n\n## Failure, partial success, and rollback behavior\n\nThe plugin manager is not transactional.\n\n| Operation stage | Failure behavior | Rollback |\n| -------------------------------------------------------- | -------------------------- | ----------------------------------------------------------------------------- |\n| `bun install` fails | install aborts with stderr | N/A (no state writes yet) |\n| Install succeeds, then manifest/feature validation fails | command fails | No uninstall rollback; dependency may remain in `node_modules`/`package.json` |\n| Install succeeds, then lockfile write fails | command fails | No rollback of installed package |\n| `bun uninstall` succeeds, lockfile write fails | command fails | Package removed, stale runtime state may remain |\n| `link` removes old target then symlink creation fails | command fails | No restoration of previous link/dir |\n\nOperationally, `doctor --fix` can repair some drift (`bun install`, orphaned config cleanup, invalid-feature cleanup), but it is best-effort.\n\n## Malformed/missing manifest behavior summary\n\n- Missing `omp`/`pi` field:\n - install/list: tolerated (minimal manifest)\n - runtime enabled-plugin discovery: skipped as non-plugin\n- Missing feature referenced by install spec or `features --set/--enable`: hard error with available feature list\n- Invalid `plugin-overrides.json`: ignored with fallback to `{}` in both manager and loader paths\n- Missing tool/hook/command file paths referenced by manifest: silently ignored during resolver expansion; flagged as errors only by `doctor`\n\n## Mode differences and precedence\n\n- `--dry-run` (install): returns synthetic install result, no filesystem/network/state writes.\n- `--json`: output formatting only, no behavior change.\n- Project overrides always take precedence over global lockfile for feature/settings view.\n- Effective enablement is `runtimeEnabled && !projectDisabled`.\n\n## Implementation files\n\n- [`src/commands/plugin.ts`](../packages/coding-agent/src/commands/plugin.ts) — CLI command declaration and flag mapping\n- [`src/cli/plugin-cli.ts`](../packages/coding-agent/src/cli/plugin-cli.ts) — action dispatch, user-facing command handlers\n- [`src/extensibility/plugins/manager.ts`](../packages/coding-agent/src/extensibility/plugins/manager.ts) — active install/remove/list/link/state/doctor implementation\n- [`src/extensibility/plugins/installer.ts`](../packages/coding-agent/src/extensibility/plugins/installer.ts) — legacy installer helpers and additional link safety checks\n- [`src/extensibility/plugins/loader.ts`](../packages/coding-agent/src/extensibility/plugins/loader.ts) — enabled-plugin discovery and tool/hook/command path resolution\n- [`src/extensibility/plugins/parser.ts`](../packages/coding-agent/src/extensibility/plugins/parser.ts) — install spec and package-name parsing helpers\n- [`src/extensibility/plugins/types.ts`](../packages/coding-agent/src/extensibility/plugins/types.ts) — manifest/runtime/override type contracts\n- [`src/extensibility/custom-tools/loader.ts`](../packages/coding-agent/src/extensibility/custom-tools/loader.ts) — runtime wiring for plugin-provided tool modules\n- [`src/extensibility/extensions/loader.ts`](../packages/coding-agent/src/extensibility/extensions/loader.ts) — runtime wiring for plugin-provided extension modules\n", "porting-from-pi-mono.md": "# Porting From pi-mono: A Practical Merge Guide\n\nThis guide is a repeatable checklist for porting changes from pi-mono into this repo.\nUse it for any merge: single file, feature branch, or full release sync.\n\n## Last Sync Point (historical upstream marker)\n\n**Commit:** `b21b42d032919de2f2e6920a76fa9a37c3920c0a`\n**Date:** 2026-03-22\n\nUpdate this section after each sync; do not reuse the previous range. This commit is an upstream pi-mono marker and may not exist in this repo's local object database.\n\nWhen starting a new sync, generate patches from this commit forward in a pi-mono checkout or remote that contains the commit:\n\n```bash\ngit format-patch b21b42d032919de2f2e6920a76fa9a37c3920c0a..HEAD --stdout > changes.patch\n```\n\n## 0) Define the scope\n\n- Identify the upstream reference (commit, tag, or PR).\n- List the packages or folders you plan to touch.\n- Decide which features are in-scope and which are intentionally skipped.\n\n## 1) Bring code over safely\n\n- Prefer a clean, focused diff rather than a wholesale copy.\n- Avoid copying built artifacts or generated files.\n- If upstream added new files, add them explicitly and review contents.\n\n## 2) Match import extension conventions\n\nMost runtime TypeScript sources omit `.js` in internal imports, but several current entrypoints and tool modules keep `.js` for ESM/runtime compatibility. Follow the surrounding file and package export style; do not blanket-strip or blanket-add extensions.\n\n- In `packages/coding-agent` runtime sources, prefer extensionless internal imports when the surrounding module does, but preserve existing `.js` imports in files that already require them.\n- In `packages/tui/test` and `packages/natives/bench`, keep `.js` where surrounding files already use it.\n- Keep real file extensions when required by tooling or import assertions (e.g., `.json`, `.css`, `.md` text embeds).\n- Example: `import { x } from \"./foo.js\";` → `import { x } from \"./foo\";` only when that package/file convention is extensionless.\n\n## 3) Replace import scopes\n\nUpstream uses different package scopes. Replace them consistently.\n\n- Replace old scopes with the local scope used here.\n- Examples (adjust to match the actual packages you are porting):\n - `@mariozechner/pi-coding-agent` → `@oh-my-pi/pi-coding-agent`\n - `@mariozechner/pi-agent-core` → `@oh-my-pi/pi-agent-core`\n - `@mariozechner/pi-tui` → `@oh-my-pi/pi-tui`\n - `@mariozechner/pi-ai` → `@oh-my-pi/pi-ai`\n\n## 4) Use Bun APIs where they improve on Node\n\nWe run on Bun, but the current source intentionally mixes Bun APIs with small Node standard-library APIs. Replace Node APIs only when Bun provides a clearer, safer, or simpler implementation; do not mechanically rewrite every Node import.\n\n**Prefer replacing when porting new code:**\n\n- Process spawning: prefer Bun Shell `$` for simple commands; use `Bun.spawn`/`Bun.spawnSync` for streaming or process control. Keep existing `child_process` only where its exact semantics are needed.\n- HTTP clients: `node-fetch`, `axios` → native `fetch`\n- SQLite: `better-sqlite3` → `bun:sqlite`\n- Env loading: `dotenv` → Bun loads `.env` automatically\n- Runtime text/assets: prefer Bun imports such as `with { type: \"text\" }` or `Bun.file()` over copy steps or bundled fallback file reads.\n\n**DO NOT replace (these work fine in Bun):**\n\n- `os.homedir()` — do NOT replace with `Bun.env.HOME` or literal `\"~\"`\n- `os.tmpdir()` — do NOT replace with `Bun.env.TMPDIR || \"/tmp\"` or hardcoded paths\n- `fs.mkdtempSync()` — do NOT replace with manual path construction\n- `path.join()`, `path.resolve()`, etc. — these are fine\n\n**Import style:** Use the `node:` prefix for Node standard-library imports. Namespace imports are common, but named imports are acceptable where the surrounding code already uses them.\n\n**Additional Bun conventions:**\n\n- Prefer Bun Shell `$` for short, non-streaming commands; use `Bun.spawn` only when you need streaming I/O or process control.\n- Use `Bun.file()`/`Bun.write()` for simple files and `node:fs/promises` for directory-oriented operations. Existing synchronous `node:fs` calls are acceptable when the calling flow is intentionally synchronous.\n- Avoid `Bun.file().exists()` checks; use `isEnoent` handling in try/catch.\n- Prefer `Bun.sleep(ms)` over `setTimeout` wrappers.\n\n**Wrong:**\n\n```typescript\n// BROKEN: env vars may be undefined, \"~\" is not expanded\nconst home = Bun.env.HOME || \"~\";\nconst tmp = Bun.env.TMPDIR || \"/tmp\";\n```\n\n**Correct:**\n\n```typescript\nimport * as os from \"node:os\";\nimport * as fs from \"node:fs\";\nimport * as path from \"node:path\";\n\nconst configDir = path.join(os.homedir(), \".config\", \"myapp\");\nconst tempDir = fs.mkdtempSync(path.join(os.tmpdir(), \"myapp-\"));\n```\n\n## 5) Prefer Bun embeds (no copying)\n\nDo not add new runtime asset copy steps. Keep assets in repo and prefer Bun embeds/imports; preserve existing explicit generation workflows such as `packages/coding-agent/src/export/html/template.generated.ts`.\n\n- If upstream copies assets into a dist folder, replace with Bun-friendly embeds.\n- Prompts are static `.md` files; use Bun text imports (`with { type: \"text\" }`) and Handlebars instead of inline prompt strings.\n- Use `import.meta.dir` + `Bun.file` to load adjacent non-text resources.\n- Keep assets in-repo and let the bundler include them.\n- Eliminate copy scripts unless the user explicitly requests them or the package already has an intentional generation step.\n- If upstream reads a bundled fallback file at runtime, replace filesystem reads with a Bun text embed import unless the current package already uses a generated asset pipeline.\n - Example (Codex instructions fallback):\n - `const FALLBACK_PROMPT_PATH = join(import.meta.dir, \"codex-instructions.md\");` -> removed\n - `import FALLBACK_INSTRUCTIONS from \"./codex-instructions.md\" with { type: \"text\" };`\n - Use `return FALLBACK_INSTRUCTIONS;` instead of `readFileSync(FALLBACK_PROMPT_PATH, \"utf8\")`\n\n## 6) Port `package.json` carefully\n\nTreat `package.json` as a contract. Merge intentionally.\n\n- Keep existing `name`, `version`, `type`, `exports`, and `bin` unless the port requires changes.\n- Replace npm/node scripts with Bun equivalents (e.g., `bun check`, `bun test`).\n- Ensure dependencies use the correct scope.\n- Do not downgrade dependencies to fix type errors; upgrade instead.\n- Validate workspace package links and `peerDependencies`.\n\n## 7) Align code style and tooling\n\n- Keep existing formatting conventions.\n- Do not introduce `any` unless required.\n- Avoid dynamic imports unless they are required for optional dependencies, startup cost, or runtime-only modules; prefer top-level imports otherwise.\n- Never build prompts in code; prompts are static `.md` files rendered with Handlebars.\n- In `packages/coding-agent`, use `logger` from `@oh-my-pi/pi-utils` for internal/runtime logging; CLI command files may use `console.*` for intentional user-facing output.\n- Use `Promise.withResolvers()` instead of `new Promise((resolve, reject) => ...)`.\n- Prefer ES `#` private fields for new encapsulated state. Constructor parameter properties already exist in current code and are acceptable; do not churn unrelated access modifiers while porting.\n- Prefer existing helpers and utilities over new ad-hoc code.\n Preserve Bun-first infrastructure changes already made in this repo:\n - Runtime is Bun (no Node entry points for the main CLI).\n - Package manager is Bun (no npm lockfiles).\n - Heavy Node APIs should not be introduced casually; current source still uses selected Node APIs (`node:crypto`, `node:readline`, synchronous `node:fs`, and `child_process`) where they fit provider, CLI, or process-control semantics.\n - Lightweight Node APIs (`os.homedir`, `os.tmpdir`, `fs.mkdtempSync`, `path.*`) are kept.\n - CLI shebangs use `bun` (not `node`, not `tsx`).\n - TypeScript packages generally use source files directly; `@oh-my-pi/pi-natives` exports generated native bindings from `packages/natives/native`.\n - CI workflows run Bun for install/check/test.\n\n## 8) Remove old compatibility layers\n\nUnless requested, remove upstream compatibility shims.\n\n- Delete old APIs that were replaced.\n- Update all call sites to the new API directly.\n- Do not keep `*_v2` or parallel versions.\n\n## 9) Update docs and references\n\n- Replace pi-mono repo links where appropriate.\n- Update examples to use Bun and correct package scopes.\n- Ensure README instructions still match the current repo behavior.\n\n## 10) Validate the port\n\nRun the standard checks after changes:\n\n- `bun check`\n\nIf the repo already has failing checks unrelated to your changes, call that out.\nTests use Bun's runner (not Vitest), but only run `bun test` when explicitly requested.\n\n## 11) Protect improved features (regression trap list)\n\nIf you already improved behavior locally, treat those as **non‑negotiable**. Before porting, write down\nthe improvements and add explicit checks so they don’t get lost in the merge.\n\n- **Freeze the expected behavior**: add a short “before/after” note for each improvement (inputs, outputs,\n defaults, edge cases). This prevents silent rollback.\n- **Map old → new APIs**: if upstream renamed concepts (hooks → extensions, custom tools → tools, etc.),\n ensure every old entry point still wires through. One missed flag or export equals lost functionality.\n- **Verify exports**: check `package.json` `exports`, public types, and barrel files. Upstream ports often\n forget to re-export local additions.\n- **Cover non‑happy paths**: if you fixed error handling, timeouts, or fallback logic, add a test or at\n least a manual checklist that exercises those paths.\n- **Check defaults and config merge order**: improvements often live in defaults. Confirm new defaults\n didn’t revert (e.g., new config precedence, disabled features, tool lists).\n- **Audit env/shell behavior**: if you fixed execution or sandboxing, verify the new path still uses your\n sanitized env and does not reintroduce alias/function overrides.\n- **Re-run targeted samples**: keep a minimal set of \"known good\" examples and run them after the port\n (CLI flags, extension registration, tool execution).\n\n## 12) Detect and handle reworked code\n\nBefore porting a file, check if upstream significantly refactored it:\n\n```bash\n# Compare the file you're about to port against what you have locally\ngit diff HEAD upstream/main -- path/to/file.ts\n```\n\nIf the diff shows the file was **reworked** (not just patched):\n\n- New abstractions, renamed concepts, merged modules, changed data flow\n\nThen you must **read the new implementation thoroughly** before porting. Blind merging of reworked code loses functionality because:\n\nNote: interactive mode was recently split into controllers/utils/types. When backporting related changes, port updates into the individual files we created and ensure `interactive-mode.ts` wiring stays in sync.\n\n1. **Defaults change silently** - A new variable `defaultFoo = [a, b]` may replace an old `getAllFoo()` that returned `[a, b, c, d, e]`.\n\n2. **API options get dropped** - When systems merge (e.g., `hooks` + `customTools` → `extensions`), old options may not wire through to the new implementation.\n\n3. **Code paths go stale** - A renamed concept (e.g., `hookMessage` → `custom`) needs updates in every switch statement, type guard, and handler—not just the definition.\n\n4. **Context/capabilities shrink** - Old APIs may have exposed `{ logger, typebox, pi }` that new APIs forgot to include.\n\n### Semantic porting process\n\nWhen upstream reworked a module:\n\n1. **Read the old implementation** - Understand what it did, what options it accepted, what it exposed.\n\n2. **Read the new implementation** - Understand the new abstractions and how they map to old behavior.\n\n3. **Verify feature parity** - For each capability in the old code, confirm the new code preserves it or explicitly removes it.\n\n4. **Grep for stragglers** - Search for old names/concepts that may have been missed in switch statements, handlers, UI components.\n\n5. **Test the boundaries** - CLI flags, SDK options, event handlers, default values—these are where regressions hide.\n\n### Quick checks\n\n```bash\n# Find all uses of an old concept that may need updating\nrg \"oldConceptName\" --type ts\n\n# Compare default values between versions\ngit show upstream/main:path/to/file.ts | rg \"default|DEFAULT\"\n\n# Check if all enum/union values have handlers\nrg \"case \\\"\" path/to/file.ts\n```\n\n## 13) Quick audit checklist\n\nUse this as a final pass before you finish:\n\n- [ ] Import extensions follow the local package convention (no blanket `.js` stripping)\n- [ ] No newly introduced Node-only APIs unless they match an existing justified pattern\n- [ ] All package scopes updated\n- [ ] `package.json` scripts use Bun\n- [ ] Prompts are `.md` text imports (no inline prompt strings)\n- [ ] No internal/runtime `console.*` in coding-agent; CLI user-facing output is intentional\n- [ ] Assets load via Bun embed/import patterns, or through an existing intentional generation pipeline\n- [ ] Tests or checks run (or explicitly noted as blocked)\n- [ ] No functionality regressions (see sections 11-12)\n\n## 14) Commit message format\n\nWhen committing a backport, follow the repo format `(scope): ` and keep the commit\nrange in the title.\n\n```\nfix(coding-agent): backported pi-mono changes (..)\n\npackages/:\n- : \n- : (# by @)\n\npackages/:\n- : \n```\n\n**Example:**\n\n```\nfix(coding-agent): backported pi-mono changes (9f3eef65f..52532c7c0)\n\npackages/ai:\n- fix: handle \"sensitive\" stop reason from Anthropic API\n- fix: normalize tool call IDs with special characters for Responses API\n- fix: add overflow detection for Bedrock, MiniMax, Kimi providers\n- fix: 429 status is rate limiting, not context overflow\n\npackages/tui:\n- fix: refactored autocomplete state tracking\n- fix: file autocomplete should not trigger on empty text\n- fix: configurable autocomplete max visible items\n- fix: improved table column width calculation with word-aware wrapping\n\npackages/coding-agent:\n- fix: preserve external config.yml edits on save (#1046 by @nicobailonMD)\n- fix: resolve macOS NFD and curly quote variants in file paths\n```\n\n**Rules:**\n\n- Group changes by package\n- Use conventional commit types (`fix`, `feat`, `refactor`, `perf`, `docs`)\n- Include upstream issue/PR numbers and contributor attribution for external contributions\n- The commit range in the title helps track sync points\n\n## 15) Intentional Divergences\n\nOur fork has architectural decisions that differ from upstream. **Do not port these upstream patterns:**\n\n### UI Architecture\n\n| Upstream | Our Fork | Reason |\n| ------------------------------------------- | --------------------------------------------------------- | --------------------------------------------------------------------- |\n| `FooterDataProvider` class | `StatusLineComponent` | Simpler, integrated status line |\n| `ctx.ui.setHeader()` / `ctx.ui.setFooter()` | No-op stubs in current extension contexts | Not currently wired to replace the TUI status/header UI |\n| `ctx.ui.setEditorComponent()` | No-op stubs in current extension contexts | Custom editor replacement is not currently wired |\n| `InteractiveModeOptions` options object | Positional constructor args (options type still exported) | Keep constructor signature; update the type when upstream adds fields |\n\n### Component Naming\n\n| Upstream | Our Fork |\n| ---------------------------- | ----------------------- |\n| `extension-input.ts` | `hook-input.ts` |\n| `extension-selector.ts` | `hook-selector.ts` |\n| `ExtensionInputComponent` | `HookInputComponent` |\n| `ExtensionSelectorComponent` | `HookSelectorComponent` |\n\n### API Naming\n\n| Upstream | Our Fork | Notes |\n| ---------------------------------------- | ---------------------------------------- | ----------------------------------------- |\n| `sessionManager.appendSessionInfo(name)` | `sessionManager.setSessionName(name)` | We use `sessionName` throughout |\n| `sessionManager.getSessionName()` | `sessionManager.getSessionName()` | Same (we unified to match upstream's RPC) |\n| `agent.sessionName` / `setSessionName()` | `agent.sessionName` / `setSessionName()` | Same |\n\n### File Consolidation\n\n| Upstream | Our Fork | Reason |\n| -------------------------------------------------- | --------------------------------------------------------- | --------------------------------------------- |\n| `clipboard.ts` + `clipboard-image.ts` (tool files) | `src/utils/clipboard.ts` backed by `@oh-my-pi/pi-natives` | Native implementation with a small TS wrapper |\n\n### Test Framework\n\n| Upstream | Our Fork |\n| ------------------------- | ----------------------------- |\n| `vitest` with `vi.mock()` | `bun:test` with `vi` from bun |\n| `node:test` assertions | `expect()` matchers |\n\n### Tool Architecture\n\n| Upstream | Our Fork | Notes |\n| ----------------------------------- | ------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------- |\n| `createTool(cwd: string, options?)` | `createTools(session: ToolSession)` via `BUILTIN_TOOLS` registry | Tool factories accept `ToolSession` and can return `null` |\n| Per-tool `*Operations` interfaces | Only current per-tool override interfaces remain (for example `FindOperations`) | Used for SSH/remote overrides where present |\n| Node.js `fs/promises` everywhere | Bun file APIs for simple file writes/reads, `node:fs/promises` for dirs, selected sync `node:fs` where needed | Prefer Bun APIs when they simplify |\n\n### Auth Storage\n\n| Upstream | Our Fork | Notes |\n| ------------------------------- | ------------------------------------------- | -------------------------------------------- |\n| `proper-lockfile` + `auth.json` | `agent.db` (bun:sqlite) | Credentials stored exclusively in `agent.db` |\n| Single credential per provider | Multi-credential with round-robin selection | Session affinity and backoff logic preserved |\n\n### Extensions\n\n| Upstream | Our Fork |\n| ----------------------------- | ------------------------------------------------- |\n| `jiti` for TypeScript loading | Native Bun `import()` |\n| `pkg.pi` manifest field | `pkg.omp` preferred; fallback to `pkg.pi` remains |\n\n### Skip These Upstream Features\n\nWhen porting, **skip** these files/features entirely:\n\n- `footer-data-provider.ts` — we use StatusLineComponent\n- `clipboard-image.ts` — image clipboard support is exposed through `src/utils/clipboard.ts` backed by `@oh-my-pi/pi-natives`\n- GitHub workflow files — we have our own CI\n- `models.generated.ts` — auto-generated, regenerate locally (as models.json instead)\n\n### Features We Added (Preserve These)\n\nThese exist in our fork but not upstream. **Never overwrite:**\n\n- `StatusLineComponent` in interactive mode\n- Multi-credential auth with session affinity\n- Capability-based discovery system (`defineCapability`, `registerProvider`, `loadCapability`, `skillCapability`, etc.)\n- MCP/Exa/SSH integrations\n- LSP writethrough for format-on-save\n- Bash interception (`checkBashInterception`)\n- Fuzzy path suggestions in read tool\n", "porting-to-natives.md": "# Porting to pi-natives (N-API) — Field Notes\n\nThis is a practical guide for moving hot paths into `crates/pi-natives` and wiring them through the generated native package entrypoint. It exists to avoid the same failures happening twice.\n\n## When to port\n\nPort when any of these are true:\n\n- The hot path runs in render loops, tight UI updates, or large batches.\n- JS allocations dominate (string churn, regex backtracking, large arrays).\n- You already have a JS baseline and can benchmark both versions side by side.\n- The work is CPU-bound or blocking I/O that can run on the libuv thread pool.\n- The work is async I/O that can run on Tokio's runtime (for example shell execution).\n\nAvoid ports that depend on JS-only state or dynamic imports. N-API exports should be data-in/data-out. Long-running work should go through `task::blocking` (CPU-bound/blocking I/O) or `task::future` (async I/O) with cancellation where the caller needs `timeoutMs` or `AbortSignal`.\n\n## Current package shape\n\n`@oh-my-pi/pi-natives` no longer has a `packages/natives/src/` TypeScript wrapper layer. The package root points at generated native artifacts:\n\n- runtime entry: `packages/natives/native/index.js`\n- types entry: `packages/natives/native/index.d.ts`\n- loader helpers: `packages/natives/native/loader-state.js`\n- embedded manifest: `packages/natives/native/embedded-addon.js`\n\nConsumers import directly from `@oh-my-pi/pi-natives`. The generated declarations are produced during `bun --cwd=packages/natives run build`.\n\n## Anatomy of a native export\n\n**Rust side:**\n\n- Implementation lives in `crates/pi-natives/src/.rs`.\n- If you add a new module, register it in `crates/pi-natives/src/lib.rs`.\n- Export with `#[napi]`; snake_case exports are converted to camelCase automatically. Use explicit JS names only for true aliases/non-default names. Use `#[napi(object)]` for object-shaped structs.\n- For CPU-bound or blocking work, use `task::blocking(tag, cancel_token, work)`.\n- For async work that needs Tokio, use `task::future(env, tag, work)`.\n- Pass a `CancelToken` when the API exposes `timeoutMs` or `AbortSignal`, and call `heartbeat()` inside long loops.\n\n**Package/build side:**\n\n- `packages/natives/scripts/build-native.ts` runs napi-rs, installs the `.node` artifact, copies generated `index.js`/`index.d.ts`, and appends enum runtime exports.\n- `packages/natives/native/index.js` is the loader that chooses a candidate `.node` file and returns the loaded addon.\n- `packages/natives/package.json` exposes only the package root (`@oh-my-pi/pi-natives`).\n\n**Consumer side:**\n\n- Update direct imports/callsites in `packages/coding-agent` or `packages/tui` when the new export replaces a JS implementation.\n- Keep higher-level policy in consumers unless it belongs in the native primitive itself.\n\n## Porting checklist\n\n1. **Add the Rust implementation**\n\n- Put the core logic in a plain Rust function.\n- If it is a new module, add it to `crates/pi-natives/src/lib.rs`.\n- Expose it with `#[napi]` so the default snake_case -> camelCase mapping stays consistent.\n- Keep signatures owned and simple: `String`, `Vec`, `Uint8Array`, `Either`, or `#[napi(object)]` structs.\n- For CPU-bound or blocking work, use `task::blocking`; for async work, use `task::future`.\n- If exposing cancellation, include `timeout_ms: Option` and `signal: Option>` in options, create `CancelToken::new(...)`, and heartbeat in long loops.\n\n2. **Build generated bindings**\n\n- Run `bun --cwd=packages/natives run build`.\n- Confirm the generated `packages/natives/native/index.d.ts` includes the new export with the intended JS name/signature.\n- Confirm `packages/natives/native/index.js` still has generated enum exports appended when enum changes are involved.\n\n3. **Update consumers**\n\n- Import the new export directly from `@oh-my-pi/pi-natives`.\n- Replace only callsites where the native implementation is faster/equivalent and preserves behavior.\n- Remove obsolete JS implementation code in the same change when the native path becomes canonical.\n\n4. **Add benchmarks**\n\n- Put benchmarks next to the owning package (`packages/tui/bench`, `packages/natives/bench`, or `packages/coding-agent/bench`).\n- Include a JS baseline and native version in the same run.\n- Use `Bun.nanoseconds()` and a fixed iteration count.\n- Keep benchmark inputs realistic for the hot path.\n\n5. **Run focused verification**\n\n- Build the native package.\n- Run the benchmark.\n- Run the narrow tests or scenario covering the changed export/callsites.\n\n## Pain points and how to avoid them\n\n### 1) Stale platform/variant artifacts\n\nThe loader probes platform-tagged artifacts in deterministic order. For x64, selected variant candidates are tried before the unsuffixed default fallback:\n\n- `modern`: `pi_natives.-modern.node`, then `...-baseline.node`, then `pi_natives..node`.\n- `baseline`: `pi_natives.-baseline.node`, then `pi_natives..node`.\n\nNon-x64 uses `pi_natives..node`.\n\nCompiled binaries also probe `//...` and a legacy user-data directory before package/executable locations. If any earlier candidate is stale, a new export may appear missing.\n\n**Fix:** remove stale candidate/cache files and rebuild.\n\n```bash\nrm packages/natives/native/pi_natives.-.node\nrm packages/natives/native/pi_natives.--modern.node\nrm packages/natives/native/pi_natives.--baseline.node\nbun --cwd=packages/natives run build\n```\n\nFor compiled binaries, delete the versioned addon cache shown in the loader error (normally under `~/.omp/natives/` unless `$XDG_DATA_HOME/omp` is used).\n\n### 2) Generated types do not match loaded binary\n\nThis can happen when `native/index.d.ts` was regenerated but the `.node` file being loaded is stale or from a different platform/variant.\n\nVerify the loaded export set from the actual candidate path:\n\n```bash\nbun -e 'const tag = `${process.platform}-${process.arch}`; const mod = require(`./packages/natives/native/pi_natives.${tag}.node`); console.log(Object.keys(mod).sort())'\n```\n\nFix the build/candidate mismatch. Do not paper over it with optional consumer checks if the export is required.\n\n### 3) Rust signature mismatch\n\nKeep N-API signatures simple and owned. Avoid borrowed references like `&str` in public exports. If you need structured data, use `#[napi(object)]` structs. If you need callbacks, use napi-rs `ThreadsafeFunction` and keep callback error/value behavior explicit.\n\n### 4) Enum runtime exports\n\nnapi-rs declarations alone are not enough for JS callers that use enum objects at runtime. `scripts/gen-enums.ts` appends enum objects to `native/index.js`. If you add or change a native enum, verify both `native/index.d.ts` and the generated enum export block in `native/index.js`.\n\n### 5) Benchmarking mistakes\n\n- Do not compare different inputs or allocations.\n- Keep JS and native using identical input arrays.\n- Run both in the same benchmark file to avoid skew.\n- Include enough iterations to smooth startup noise, but keep inputs realistic.\n\n## Benchmark template\n\n```ts\nconst ITERATIONS = 2000;\n\nfunction bench(name: string, fn: () => void): number {\n const start = Bun.nanoseconds();\n for (let i = 0; i < ITERATIONS; i++) fn();\n const elapsed = (Bun.nanoseconds() - start) / 1e6;\n console.log(\n `${name}: ${elapsed.toFixed(2)}ms total (${(elapsed / ITERATIONS).toFixed(6)}ms/op)`,\n );\n return elapsed;\n}\n\nbench(\"feature/js\", () => {\n jsImpl(sample);\n});\n\nbench(\"feature/native\", () => {\n nativeImpl(sample);\n});\n```\n\n## Verification checklist\n\n- Generated `native/index.d.ts` includes the new export and intended TS signature.\n- The loaded `.node` file's `Object.keys(require(candidate))` includes the new export.\n- Runtime enum objects are present when the change adds/changes enums.\n- Bench numbers are recorded in the PR/notes.\n- Call sites are updated only if native is faster/equal and behavior-compatible.\n- Obsolete JS code is removed when the native implementation becomes canonical.\n\n## Rule of thumb\n\n- If native is slower, do not switch callsites. Keep or remove the export based on whether it has a near-term owner.\n- If native is faster and behavior-compatible, switch callsites and keep a benchmark to catch regressions.\n", "provider-streaming-internals.md": "# Provider streaming internals\n\nThis document explains how token/tool streaming is normalized in `@oh-my-pi/pi-ai`, then propagated through `@oh-my-pi/pi-agent-core` and `coding-agent` session events.\n\n## End-to-end flow\n\n1. `streamSimple()` (`packages/ai/src/stream.ts`) maps generic options and dispatches to a provider stream function.\n2. Provider stream functions translate provider-native stream events into the unified `AssistantMessageEvent` sequence. Current built-ins include Anthropic, OpenAI Responses/Completions/Codex/Azure Responses, Google Gemini/Gemini CLI/Vertex, Bedrock Converse, Ollama, Cursor, plus GitLab Duo/Kimi wrappers and extension-registered custom APIs.\n3. Each provider pushes events into `AssistantMessageEventStream` (`packages/ai/src/utils/event-stream.ts`), which throttles delta events and exposes:\n - async iteration for incremental updates\n - `result()` for final `AssistantMessage`\n4. `agentLoop` (`packages/agent/src/agent-loop.ts`) consumes those events, mutates in-flight assistant state, and emits `message_update` events carrying the raw `assistantMessageEvent`.\n5. `AgentSession` (`packages/coding-agent/src/session/agent-session.ts`) subscribes to agent events, persists messages, drives extension hooks, and applies session behaviors (retry, compaction, TTSR, streaming-edit abort checks).\n\n## Unified stream contract in `@oh-my-pi/pi-ai`\n\nAll providers emit the same shape (`AssistantMessageEvent` in `packages/ai/src/types.ts`):\n\n- `start`\n- content block lifecycle triplets:\n - text: `text_start` → `text_delta`\\* → `text_end`\n - thinking: `thinking_start` → `thinking_delta`\\* → `thinking_end`\n - tool call: `toolcall_start` → `toolcall_delta`\\* → `toolcall_end`\n- terminal event:\n - `done` with `reason: \"stop\" | \"length\" | \"toolUse\"`\n - or `error` with `reason: \"aborted\" | \"error\"`\n\n`AssistantMessageEventStream` guarantees:\n\n- final result is resolved by terminal event (`done` or `error`)\n- deltas are batched/throttled (~50ms)\n- buffered deltas are flushed before non-delta events and before completion\n\n## Delta throttling and harmonization behavior\n\n`AssistantMessageEventStream` treats `text_delta`, `thinking_delta`, and `toolcall_delta` as mergeable events:\n\n- buffered deltas are merged only when **type + contentIndex** match\n- merge keeps the latest `partial` snapshot\n- non-delta events force immediate flush\n\nThis smooths high-frequency provider streams for TUI/event consumers, but is not provider backpressure: providers still produce at full speed, while the local stream buffers.\n\n## Provider normalization details\n\n## Anthropic (`anthropic-messages`)\n\nSource: `packages/ai/src/providers/anthropic.ts`\n\nNormalization points:\n\n- `message_start` initializes usage (input/output/cache tokens)\n- `content_block_start` maps to text/thinking/toolcall starts\n- `content_block_delta` maps:\n - `text_delta` → `text_delta`\n - `thinking_delta` → `thinking_delta`\n - `input_json_delta` → `toolcall_delta`\n - `signature_delta` updates `thinkingSignature` only (no event)\n- `content_block_stop` emits corresponding `*_end`\n- `message_delta.stop_reason` maps via `mapStopReason()`\n\nTool-call argument streaming:\n\n- each tool block carries internal `partialJson`\n- every JSON delta appends to `partialJson`\n- `arguments` are reparsed on each delta via `parseStreamingJson()`\n- `toolcall_end` reparses once more, then strips `partialJson`\n\n## OpenAI Responses family (`openai-responses`, `openai-codex-responses`, `azure-openai-responses`)\n\nSources: `packages/ai/src/providers/openai-responses.ts`, `openai-codex-responses.ts`, and `azure-openai-responses.ts`\n\nNormalization points:\n\n- `response.output_item.added` starts reasoning/text/function-call blocks\n- reasoning summary events (`response.reasoning_summary_text.delta`) become `thinking_delta`\n- output/refusal deltas become `text_delta`\n- `response.function_call_arguments.delta` becomes `toolcall_delta`\n- `response.output_item.done` emits `thinking_end` / `text_end` / `toolcall_end`\n- `response.completed` maps status to stop reason and usage\n\nTool-call argument streaming:\n\n- same `partialJson` accumulation pattern as Anthropic\n- providers that send only `response.function_call_arguments.done` still populate final args\n- tool call IDs are normalized as `\"|\"`\n\n## Google Generative AI (`google-generative-ai`)\n\nSource: `packages/ai/src/providers/google.ts`\n\nNormalization points:\n\n- iterates `candidate.content.parts`\n- text parts are split into thinking vs text by `isThinkingPart(part)`\n- block transitions close previous block before starting a new one\n- `part.functionCall` is treated as a complete tool call (start/delta/end emitted immediately)\n- finish reason mapped by `mapStopReason()` from `google-shared.ts`\n\nTool-call argument streaming:\n\n- function call args arrive as structured object, not incremental JSON text\n- implementation emits one synthetic `toolcall_delta` containing `JSON.stringify(arguments)`\n- no partial JSON parser needed for Google in this path\n\n## Partial tool-call JSON accumulation and recovery\n\nShared behavior for Anthropic/OpenAI Responses uses `parseStreamingJson()` (`packages/ai/src/utils/json-parse.ts`):\n\n1. try `JSON.parse`\n2. fallback to `partial-json` parser for incomplete fragments\n3. if both fail, return `{}`\n\nImplications:\n\n- malformed or truncated argument deltas do not crash stream processing immediately\n- in-progress `arguments` may temporarily be `{}`\n- later valid deltas can recover structured arguments because parsing is retried on every append\n- final `toolcall_end` performs one more parse attempt before emission\n\n## Stop reasons vs transport/runtime errors\n\nProvider stop reasons are mapped to normalized `stopReason`:\n\n- Anthropic: `end_turn`→`stop`, `max_tokens`→`length`, `tool_use`→`toolUse`, safety/refusal cases→`error`\n- OpenAI Responses: `completed`→`stop`, `incomplete`→`length`, `failed/cancelled`→`error`\n- Google: `STOP`→`stop`, `MAX_TOKENS`→`length`, safety/prohibited/malformed-function-call classes→`error`\n\nError semantics are split in two stages:\n\n1. **Model completion semantics** (provider reported finish reason/status)\n2. **Transport/runtime failure** (network/client/parser/abort exceptions)\n\nIf provider stream throws or signals failure, each provider wrapper catches and emits terminal `error` event with:\n\n- `stopReason = \"aborted\"` when abort signal is set\n- otherwise `stopReason = \"error\"`\n- `errorMessage = formatErrorMessageWithRetryAfter(error)`\n\n## Malformed chunk / SSE parse failure behavior\n\nFor these provider paths, chunk/SSE framing is handled by vendor SDK streams (Anthropic SDK, OpenAI SDK, Google SDK). This code does not implement a custom SSE decoder here.\n\nObserved behavior in current implementation:\n\n- malformed chunk/SSE parsing at SDK level surfaces as an exception or stream `error` event\n- provider wrapper converts that into unified terminal `error` event\n- no provider-specific resume/retry inside the stream function itself\n- higher-level retries are handled in `AgentSession` auto-retry logic (message-level retry, not stream-chunk replay)\n\n## Cancellation boundaries\n\nCancellation is layered:\n\n- AI provider request: `options.signal` is passed into provider client stream call.\n- Provider wrapper: after stream loop, aborted signal forces error path (`\"Request was aborted\"`).\n- Agent loop: checks `signal.aborted` before handling each provider event and can synthesize an aborted assistant message from the latest partial.\n- Session/agent controls: `AgentSession.abort()` -> `agent.abort()` -> shared abort controller cancellation.\n\nTool execution cancellation is separate from model stream cancellation:\n\n- tool runners use `AbortSignal.any([agentSignal, steeringAbortSignal])`\n- steering interrupts can abort remaining tool execution while preserving already-produced tool results\n\n## Backpressure boundaries\n\nThere is no hard backpressure mechanism between provider SDK stream and downstream consumers:\n\n- `EventStream` uses in-memory queues with no max size\n- throttling reduces UI update rate but does not slow provider intake\n- if consumers lag significantly, queued events can grow until completion\n\nCurrent design favors responsiveness and simple ordering over bounded-buffer flow control.\n\n## How stream events surface as agent/session events\n\n`agentLoop.streamAssistantResponse()` bridges `AssistantMessageEvent` to `AgentEvent`:\n\n- on `start`: pushes placeholder assistant message and emits `message_start`\n- on block events (`text_*`, `thinking_*`, `toolcall_*`): updates last assistant message, emits `message_update` with raw `assistantMessageEvent`\n- on terminal (`done`/`error`): resolves final message from `response.result()`, emits `message_end`\n\n`AgentSession` then consumes those events for session-level behaviors:\n\n- TTSR watches `message_update.assistantMessageEvent` for `text_delta`, `thinking_delta`, and `toolcall_delta`\n- streaming edit guard inspects `toolcall_delta`/`toolcall_end` on `edit` calls and can abort early\n- persistence writes finalized messages at `message_end`\n- auto-retry examines assistant `stopReason === \"error\"` plus `errorMessage` heuristics\n\n## Unified vs provider-specific responsibilities\n\nUnified (common contract):\n\n- event shape (`AssistantMessageEvent`)\n- final result extraction (`done`/`error`)\n- delta throttling + merge rules\n- agent/session event propagation model\n\nProvider-specific (not fully abstracted):\n\n- upstream event taxonomies and mapping logic\n- stop-reason translation tables\n- tool-call ID conventions\n- reasoning/thinking block semantics and signatures\n- usage token semantics and availability timing\n- message conversion constraints per API\n\n## Implementation files\n\n- [`../../ai/src/stream.ts`](../packages/ai/src/stream.ts) — provider dispatch, option mapping, API key/session plumbing, custom API dispatch, and provider-specific credential handling.\n- [`../../ai/src/utils/event-stream.ts`](../packages/ai/src/utils/event-stream.ts) — generic stream queue + assistant delta throttling.\n- [`../../ai/src/utils/json-parse.ts`](../packages/ai/src/utils/json-parse.ts) — partial JSON parsing for streamed tool arguments.\n- [`../../ai/src/providers/anthropic.ts`](../packages/ai/src/providers/anthropic.ts) — Anthropic event translation and tool JSON delta accumulation.\n- [`../../ai/src/providers/openai-responses.ts`](../packages/ai/src/providers/openai-responses.ts), [`openai-codex-responses.ts`](../packages/ai/src/providers/openai-codex-responses.ts), [`azure-openai-responses.ts`](../packages/ai/src/providers/azure-openai-responses.ts) — Responses-family event translation and status mapping.\n- [`../../ai/src/providers/google.ts`](../packages/ai/src/providers/google.ts), [`google-gemini-cli.ts`](../packages/ai/src/providers/google-gemini-cli.ts), [`google-vertex.ts`](../packages/ai/src/providers/google-vertex.ts) — Gemini stream chunk-to-block translation variants.\n- [`../../ai/src/providers/google-shared.ts`](../packages/ai/src/providers/google-shared.ts) — Gemini finish-reason mapping and shared conversion rules.\n- [`../../ai/src/providers/amazon-bedrock.ts`](../packages/ai/src/providers/amazon-bedrock.ts), [`openai-completions.ts`](../packages/ai/src/providers/openai-completions.ts), [`ollama.ts`](../packages/ai/src/providers/ollama.ts), [`cursor.ts`](../packages/ai/src/providers/cursor.ts) — additional built-in stream adapters using the same event contract.\n- [`../../agent/src/agent-loop.ts`](../packages/agent/src/agent-loop.ts) — provider stream consumption and `message_update` bridging.\n- [`../src/session/agent-session.ts`](../packages/coding-agent/src/session/agent-session.ts) — session-level handling of streaming updates, abort, retry, and persistence.\n", "python-repl.md": "# Eval Tool Python Backend\n\nThis document describes the Python execution stack in `packages/coding-agent`.\nIt covers tool behavior, runner lifecycle, environment handling, execution semantics, output rendering, supported magics, and operational failure modes.\n\n## Scope and Key Files\n\n- Tool surface: `src/tools/eval.ts`\n- Session/per-call kernel orchestration: `src/eval/py/executor.ts`\n- Subprocess kernel client: `src/eval/py/kernel.ts`\n- Python wrapper / NDJSON server: `src/eval/py/runner.py`\n- Prelude helpers loaded into every kernel: `src/eval/py/prelude.py`\n- MIME bundle renderer (text + structured outputs): `src/eval/py/display.ts`\n- Interactive-mode renderer for user-triggered Python runs: `src/modes/components/eval-execution.ts`\n- Runtime/env filtering and Python resolution: `src/eval/py/runtime.ts`\n\n## What eval's Python backend is\n\nThe `eval` tool executes one or more Python cells inside a long-lived `python3` subprocess that speaks NDJSON over stdin/stdout. No Jupyter, no kernel gateway, no extra pip dependencies — a vanilla Python 3.8+ interpreter is enough. Rich `display()` output (PIL, pandas, plotly, matplotlib figures) keeps working because the wrapper reimplements the MIME-bundle dispatch that IPython previously provided.\n\nTool params:\n\n```ts\n{\n cells: Array<{ code: string; title?: string }>;\n timeout?: number; // seconds, clamped to 1..600, default 30\n reset?: boolean; // reset selected runtime before the first cell only\n}\n```\n\nThe tool is `concurrency = \"exclusive\"` for a session, so calls do not overlap.\n\n## Kernel lifecycle\n\nEach kernel is a single Python subprocess: `python -u `. The runner is bundled with the host binary (Bun text import), written to `~/.omp/python-env`-adjacent tmp cache once per script-hash, and reused by every subsequent spawn.\n\nKernel startup sequence:\n\n1. Availability check (`checkPythonKernelAvailability`) — verifies that a Python interpreter resolves and runs.\n2. Spawn `python -u runner.py` with filtered env and `cwd`.\n3. Send an init request that runs `os.chdir(cwd)`, injects env entries, and adds `cwd` to `sys.path`.\n4. Execute `PYTHON_PRELUDE` (idempotent — only initializes once per process).\n\nKernel shutdown:\n\n- Send `{\"type\": \"exit\"}` over stdin.\n- Wait for process exit with `SHUTDOWN_GRACE_MS` budget.\n- Escalate to `SIGTERM` and finally `SIGKILL` if the process does not exit in time.\n\n## Wire protocol (NDJSON, host ↔ runner)\n\nOne JSON object per line, UTF-8, `\\n` terminated.\n\nHost → runner:\n\n```jsonc\n{\"id\": \"\", \"code\": \"\", \"silent\": false, \"storeHistory\": true}\n{\"type\": \"exit\"}\n```\n\nRunner → host:\n\n```jsonc\n{\"type\": \"started\", \"id\": \"\"}\n{\"type\": \"stdout\", \"id\": \"\", \"data\": \"...\"}\n{\"type\": \"stderr\", \"id\": \"\", \"data\": \"...\"}\n{\"type\": \"display\", \"id\": \"\", \"bundle\": {: }}\n{\"type\": \"result\", \"id\": \"\", \"bundle\": {: }}\n{\"type\": \"error\", \"id\": \"\", \"ename\": \"...\", \"evalue\": \"...\", \"traceback\": [\"...\"]}\n{\"type\": \"done\", \"id\": \"\", \"status\": \"ok\"|\"error\", \"executionCount\": N, \"cancelled\": false}\n```\n\nStatus events the prelude emits (e.g. `_emit_status(\"find\", count=…)`) ship inside display bundles under `application/x-omp-status` so the existing TUI status renderer keeps working.\n\n## Magics\n\nThe runner's source transformer rewrites IPython-style magics to plain Python calls before parsing. Supported set:\n\n| Magic | Effect |\n| --- | --- |\n| `%pip ` | `python -m pip ` with live streaming output. Newly installed packages are evicted from `sys.modules` so the next `import` picks up the fresh install. |\n| `%cd ` | `os.chdir(path)` (with `~` expansion); emits status event. |\n| `%pwd` | Returns `os.getcwd()`. |\n| `%ls [path]` | Returns `sorted(os.listdir(path))`. |\n| `%env [KEY[=VAL]]` | List, read, or set env vars (matches prelude `env()` semantics). |\n| `%set_env KEY VALUE` | Set `os.environ[KEY]`. |\n| `%time ` / `%timeit ` | Time the expression; emits status event with elapsed ms. |\n| `%who` / `%whos` | List user-namespace names. |\n| `%reset` | Clear user globals and re-inject prelude. |\n| `%load ` | Read a file into a fresh cell and execute. |\n| `%run ` | `runpy.run_path` and merge globals back. |\n| `%%bash` / `%%sh` | Run the cell body via `bash`/`sh`. |\n| `%%capture [name]` | Run body with stdout/stderr captured into `name`. |\n| `%%timeit` | Time the cell body. |\n| `%%writefile ` | Write body to file. |\n| `!cmd` / `var = !cmd` | Run command via subprocess shell; returns an SList-style result with `.n` / `.s` helpers. |\n| `var = %name args` | Assignment forms work for line magics and `!cmd`. |\n\nUnknown magic names raise `NameError: UsageError: ...` inside the cell.\n\n## Session persistence semantics\n\n`python.kernelMode` controls retained kernel reuse:\n\n- `session` (default)\n - Reuses kernel sessions keyed by session file plus cwd when a session file exists; otherwise by cwd.\n - Execution is serialized per session via a queue.\n - Idle sessions are evicted after 5 minutes.\n - At most 4 sessions; oldest is evicted on overflow.\n - Heartbeat checks detect dead kernels.\n - Auto-restart allowed once; repeated crash ⇒ hard failure.\n- `per-call`\n - Spawns a fresh subprocess for each request.\n - Shuts the subprocess down after the request.\n - No cross-call state persistence.\n\n### Multi-cell behavior in a single tool call\n\nCells run sequentially in the same kernel instance for that tool call.\n\nIf an intermediate cell fails:\n\n- Earlier cell state remains in memory.\n- Tool returns a targeted error indicating which cell failed.\n- Later cells are not executed.\n\n`reset=true` only applies to the first cell execution in that call.\n\n## Environment filtering and runtime resolution\n\nEnvironment is filtered before launching the runner:\n\n- Allowlist includes core vars like `PATH`, `HOME`, locale vars, `VIRTUAL_ENV`, `PYTHONPATH`, etc.\n- Allow-prefixes: `LC_`, `XDG_`, `PI_`\n- Denylist strips common API keys (OpenAI/Anthropic/Gemini/etc.)\n\nRuntime selection order:\n\n1. Active/located venv (`VIRTUAL_ENV`, then `/.venv`, `/venv`)\n2. Managed venv at `~/.omp/python-env`\n3. `python` or `python3` on PATH\n\nWhen a venv is selected, its bin/Scripts path is prepended to `PATH`.\n\nThe runner additionally receives `PYTHONUNBUFFERED=1` and `PYTHONIOENCODING=utf-8` so streamed output reaches the host promptly.\n\n## Tool availability and mode selection\n\n`eval.py` / `eval.js` (both default `true`) plus optional `PI_PY` override controls eval backend exposure:\n\n- Python backend only (`eval.py=true`, `eval.js=false`)\n- JavaScript backend only (`eval.py=false`, `eval.js=true`)\n- both backends\n\n`PI_PY` accepted values:\n\n- `0` / `bash` → JavaScript backend only\n- `1` / `py` → Python backend only\n- `mix` / `both` → both backends\n\nIf Python preflight fails and `eval.js` is enabled, `eval` remains available and dispatches to JavaScript unless `language: \"python\"` is explicitly requested.\n\n## Execution flow and cancellation/timeout\n\n### Tool-level timeout\n\n`eval` timeout is in seconds, default 30, clamped to `1..600`. The tool combines caller abort signal and timeout signal with `AbortSignal.any(...)`.\n\n### Kernel execution cancellation\n\nOn abort/timeout:\n\n- The host sends `kill(\"SIGINT\")` to the runner subprocess.\n- The runner's exec-time signal handler raises `KeyboardInterrupt` inside the user code.\n- Result includes `cancelled=true`; timeout path annotates output as `Command timed out after seconds`.\n- Between requests the runner installs `SIG_IGN` for SIGINT so a stray cancel does not tear down the kernel.\n\nIf a second cancel is required (runner stuck in C code), the host escalates to `SIGTERM` and the session restarts on the next call.\n\n### stdin behavior\n\nInteractive stdin is not supported. The runner does not forward `input()` prompts; user code that calls `input()` blocks until cancellation.\n\n## Output capture and rendering\n\n### Captured output classes\n\nFrom runner frames:\n\n- `stdout` / `stderr` → plain text chunks\n- `display` / `result` → rich display handling (MIME bundle)\n- `error` → traceback text\n- `application/x-omp-status` MIME inside `display` → structured status events\n\nDisplay MIME precedence:\n\n1. `text/markdown`\n2. `text/plain`\n3. `text/html` (converted to basic markdown)\n\nAdditionally captured as structured outputs:\n\n- `application/json` → JSON tree data\n- `image/png` / `image/jpeg` → image payloads\n- `application/x-omp-status` → status events\n\n### Matplotlib\n\nThe runner sets `MPLBACKEND=Agg` as an environ default so figures render off-screen. After every cell, `pyplot.get_fignums()` is iterated; each figure is saved to PNG, emitted as an `image/png` display, and closed.\n\n### Storage and truncation\n\nOutput is streamed through `OutputSink` and may be persisted to artifact storage. Tool results can include truncation metadata and `artifact://` for full output recovery.\n\n### Renderer behavior\n\n- Tool renderer (`eval.ts`):\n - shows code-cell blocks with per-cell status\n - collapsed preview defaults to 10 lines\n - supports expanded mode for full output and richer status detail\n- Interactive renderer (`eval-execution.ts`):\n - used for user-triggered Python execution in TUI\n - collapsed preview defaults to 20 lines\n - clamps very long individual lines to 4000 chars for display safety\n - shows cancellation/error/truncation notices\n\n## Operational troubleshooting\n\n- **Python backend not available** — Check `eval.py`, `PI_PY`, and that `python`/`python3` is on PATH. If preflight fails and `eval.js` is enabled, omit `language` or pass `language: \"js\"` to use JavaScript.\n- **No Python on PATH** — Install a system Python 3.8+ or place a venv at `~/.omp/python-env`. `omp setup python --check` reports the resolved interpreter.\n- **Execution hangs then times out** — Increase tool `timeout` (max 600s) if workload is legitimate. For stuck native code, cancellation triggers `SIGINT` first then escalates; the session restarts on the next request.\n- **stdin/input prompts in Python code** — `input()` is not supported; pass data programmatically.\n- **Working directory errors** — Tool validates `cwd` exists and is a directory before execution.\n\n## Relevant environment variables\n\n- `PI_PY` — tool exposure override\n- `PI_PYTHON_SKIP_CHECK=1` — bypass Python preflight/warm checks\n- `PI_PYTHON_INTEGRATION=1` — enable gated integration tests that spawn a real Python\n- `PI_PYTHON_IPC_TRACE=1` — log NDJSON frames exchanged with the runner subprocess\n", "render-mermaid.md": "# RenderMermaid\n\n`RenderMermaid` is an optional built-in tool that renders Mermaid source to terminal-friendly text.\n\n## Enable it\n\nDisabled by default. Turn it on in `/settings` under **Tools → Render Mermaid**, or in `~/.omp/agent/config.yml`:\n\n```yaml\nrenderMermaid:\n enabled: true\n```\n\n## What it does\n\n- Tool name: `render_mermaid`\n- Input: Mermaid source in the required `mermaid` field\n- Output: rendered ASCII/Unicode text, not SVG or PNG\n- Storage: when artifact storage is available, the full render is also saved as an `artifact://...`\n\nThere are no model-specific or environment-variable prerequisites. Once enabled, any model that can call built-in tools can use it.\n\n## Parameters\n\n```json\n{\n \"mermaid\": \"graph TD\\n A[Start] --> B[Stop]\",\n \"config\": {\n \"useAscii\": false,\n \"paddingX\": 2,\n \"paddingY\": 2,\n \"boxBorderPadding\": 0\n }\n}\n```\n\nAvailable `config` fields:\n\n- `useAscii` — `true` for plain ASCII, `false` for Unicode box-drawing characters (default and usually more readable)\n- `paddingX` — horizontal spacing between nodes\n- `paddingY` — vertical spacing between nodes\n- `boxBorderPadding` — inner padding inside node boxes\n\n## Current limitations\n\n`RenderMermaid` uses the `beautiful-mermaid` ASCII renderer. It works best for flowcharts and small diagrams.\n\nComplex sequence diagrams, especially with `alt` / `else` blocks, can become very wide in a terminal. That is current renderer behavior, not a provider or model configuration problem.\n\nIf a sequence diagram is hard to read:\n\n1. Keep Unicode output (`useAscii: false`)\n2. Reduce spacing with a tighter config such as `paddingX: 2`, `paddingY: 2`, `boxBorderPadding: 0`\n3. Prefer smaller sub-diagrams over one large sequence diagram\n4. Open the saved artifact if the inline preview is truncated in the TUI\n\n## Example\n\nInput:\n\n```mermaid\ngraph TD\n A[Start] --> B{Decision}\n B -->|Yes| C[Action]\n B -->|No| D[End]\n```\n\nTypical result:\n\n```text\n┌─────┐\n│Start│\n└─────┘\n │\n ▼\n┌────────┐\n│Decision│\n└────────┘\n```\n", "resolve-tool-runtime.md": "# Resolve tool runtime internals\n\nThis document explains how preview/apply workflows are modeled in coding-agent and how built-in or custom tools can participate via the tool-choice queue and `pushPendingAction`.\n\n## Scope and key files\n\n- [`src/tools/resolve.ts`](../packages/coding-agent/src/tools/resolve.ts)\n- [`src/tools/ast-edit.ts`](../packages/coding-agent/src/tools/ast-edit.ts)\n- [`src/extensibility/custom-tools/types.ts`](../packages/coding-agent/src/extensibility/custom-tools/types.ts)\n- [`src/extensibility/custom-tools/loader.ts`](../packages/coding-agent/src/extensibility/custom-tools/loader.ts)\n- [`src/sdk.ts`](../packages/coding-agent/src/sdk.ts)\n\n## What `resolve` does\n\n`resolve` is a hidden tool that finalizes a pending preview action.\n\n- `action: \"apply\"` executes the queued action's `apply(reason)` callback and returns that result with resolve metadata.\n- `action: \"discard\"` invokes `reject(reason)` if provided; otherwise returns `Discarded: