# 🤖 Alvin Bot — Autonomous AI Agent

> Your personal AI agent — on Telegram, WhatsApp, Discord, Slack, Signal, Terminal, and Web.

Alvin Bot is an open-source, MIT-licensed, self-hosted autonomous AI agent that runs on your own machine and answers you on Telegram, Slack, Discord, WhatsApp, Signal, a terminal TUI, and a web dashboard. It is built on the official Claude Agent SDK and runs a provider-agnostic engine that also drives OpenAI, Groq, Google Gemini, NVIDIA NIM, OpenRouter, and Ollama, with automatic failover after two consecutive provider failures and a heartbeat health check every five minutes. Unlike most personal AI agents, it ships a zero-config indexed memory store: with no embedding API key it falls back to a built-in SQLite FTS5 keyword index, so recall works out of the box. It dispatches detached sub-agents as independent `claude -p` subprocesses that keep running and deliver their result even if the parent conversation is aborted. It is local-first and telemetry-free — prompts and responses are never logged off-machine, secrets live in a chmod-0600 `.env`, and shell execution is allowlisted by default.

> **What's new — v5.44.1 (June 2026):** Laptops no longer restart the bot after waking from sleep — the crash-backstop now tells a sleep gap apart from a real freeze (by how long the machine has actually been awake), so a healthy bot is left alone while a genuinely stuck one is still recovered. Builds on the v5.33 / v5.36 sleep-hardening series. v5.44.0 added a clear `/provider` (pick the AI service) vs `/model` (pick the model within it) hierarchy, applied instantly. [Full changelog →](CHANGELOG.md)

---

## ✨ Features

### 🧠 Intelligence
- **Multi-model engine** — Claude Agent SDK · OpenAI · Groq · NVIDIA NIM · Gemini · OpenRouter · Ollama · Codex CLI · any OpenAI-compatible API
- **Automatic fallback + heartbeat monitor** — pings providers every 5 min, auto-failover after 2 failures, auto-recovery; reorder priority via Telegram `/fallback`, Web UI, or API
- **Adjustable thinking depth** — `/effort low` to `/effort max`
- **Pluggable memory backends (v4.22)** — Gemini · OpenAI · Ollama · FTS5 keyword fallback. Auto-detection picks the best available. Indexed search across `MEMORY.md`, daily logs, project files, hub memory, asset index. Override via `EMBEDDINGS_PROVIDER`.
- **Smart system-prompt injection (v4.22)** — once SQLite is populated, stops bulk-injecting `MEMORY.md` and surfaces only the chunks relevant to the user's current message. Cuts ~25 k tokens per turn for typical setups. `MEMORY_INJECT_MODE=auto|legacy|sqlite` to override.
- **Layered memory (L0–L3)** — `identity.md` + `preferences.md` always plain-text · project memories on topic match · daily logs / curated knowledge via semantic or keyword search
- **Persistent sessions** — Claude SDK resume tokens, conversation history, language, effort survive bot restarts
- **Multi-session workspaces** — parallel context-isolated sessions per Slack channel or `/workspace` switch, each with its own cwd, purpose, persona. Memory + skills stay globally shared. [How-to ↓](#-multi-session-workspaces-v4120)
- **Detached sub-agents** — `alvin_dispatch_agent` MCP tool spawns independent `claude -p` subprocesses that survive parent aborts. Results deliver as separate messages. Works identically on Telegram / Slack / Discord / WhatsApp.
- **Smart tool discovery** — scans your system at startup; typical install surfaces 30–70 tools depending on what's locally available
- **Skill system** — 14 SKILL.md files (see [Skills ↓](#-skills)) auto-activate based on message context
- **Self-awareness + auto-language** — knows it IS the AI · detects EN/DE/ES/FR and adapts; learns preference over time
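
The fallback behaviour described above (failover after two consecutive failures, auto-recovery on the next success) can be sketched roughly as follows. This is a minimal illustration, not the bot's actual engine code: the `FallbackChain` class, its method names, and the `groq` provider key used in the usage note are hypothetical.

```typescript
// Hypothetical sketch of "failover after 2 consecutive failures, then auto-recovery".
type ProviderState = { consecutiveFailures: number; healthy: boolean };

const FAILURE_THRESHOLD = 2; // the README states failover after 2 failures

class FallbackChain {
  private readonly order: string[];
  private readonly state = new Map<string, ProviderState>();

  constructor(order: string[]) {
    this.order = order;
    for (const key of order) {
      this.state.set(key, { consecutiveFailures: 0, healthy: true });
    }
  }

  // Record the outcome of a real request or of a periodic heartbeat ping.
  report(key: string, ok: boolean): void {
    const s = this.state.get(key);
    if (!s) return; // unknown provider keys are ignored
    if (ok) {
      s.consecutiveFailures = 0;
      s.healthy = true; // a single success restores the provider
    } else {
      s.consecutiveFailures += 1;
      if (s.consecutiveFailures >= FAILURE_THRESHOLD) s.healthy = false;
    }
  }

  // First healthy provider in priority order, or undefined if all are down.
  active(): string | undefined {
    return this.order.find((key) => this.state.get(key)?.healthy === true);
  }
}
```

With a chain of `["claude-sdk", "groq"]`, one failed call leaves `claude-sdk` active, a second consecutive failure moves traffic to `groq`, and the next successful heartbeat moves it back.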

### 💬 Multi-Platform
- **Telegram** — streaming, inline keyboards, voice, photos, documents
- **Slack** — Socket Mode via `@slack/bolt`, DMs + @mentions, file attachments, `assistant.threads.setStatus` typing. **One channel = one isolated workspace.**
- **WhatsApp** — via WhatsApp Web; self-chat as AI notepad, group whitelist with per-contact access, full media. Owner approval gate routes to Telegram (WhatsApp self-chat / Discord / Signal as fallback) before the bot replies.
- **Discord** — server bot with mention/reply detection and slash commands
- **Signal** — via signal-cli REST API, voice transcription
- **Terminal** — rich TUI with ANSI colors + streaming (`alvin-bot tui`)
- **Web UI** — full dashboard, chat, settings, file manager, terminal, workspace overview

### 🔧 Capabilities
- **Tool layer** — Shell · files · Python · git · email · PDF · media · vision · screenshots · system control. Universal tool use across any provider that supports function calling; text-only fallback for those that don't.
- **6 built-in plugins** — weather · finance · notes · calendar · email · smarthome
- **MCP client** — connect any Model Context Protocol server
- **Cron** — AI-driven scheduled tasks (`"check my email every morning"`)
- **Voice** — STT via Groq Whisper, TTS via Edge TTS or ElevenLabs
- **Vision + image generation** — photo / document analysis · Gemini / DALL·E generation with API key
- **Browser** — 4-tier strategy: WebFetch · stealth Playwright · CDP with persistent profile · agent-browser CLI (Tier-1.5, opt-in)

### 🖥️ Web Dashboard
- WebSocket streaming chat · model switcher · platform & provider setup · file manager · memory editor · session browser · in-browser terminal · maintenance + health · workspace cards with cost aggregation

---

## ⚖️ How Alvin Bot compares

Alvin Bot sits in the same category as **Hermes Agent** (Nous Research) and **OpenClaw** — self-hosted, open-source personal AI agents that live on your machine and reach you on the chat apps you already use. They optimize for different things. This table is intended to be fair: where Hermes or OpenClaw is the better tool, it says so.

| Dimension | **Alvin Bot** | **Hermes Agent** | **OpenClaw** |
|---|---|---|---|
| License / hosting | MIT · self-hosted · local-first · zero telemetry | MIT · self-hosted · 7 execution backends | Open-source · self-hosted · bring-your-own-key |
| Model providers | Claude Agent SDK + OpenAI · Groq · Gemini · NVIDIA NIM · OpenRouter · Ollama, with **automatic failover after 2 provider failures + a 5-min heartbeat monitor** | 200+ models | Bring-your-own model / key |
| Sub-agents | **Detached `claude -p` subprocesses that survive a parent abort**; `readonly`/`research` toolset presets | Isolated subagents for parallel workstreams | Not a primary focus |
| Browser automation | **4-tier escalation**: WebFetch → stealth Playwright → persistent-profile CDP → agent-browser CLI | Built-in browse / vision tools | Via tools |
| Platforms | Telegram · Slack · Discord · WhatsApp · Signal · terminal TUI · Web (7) | 20+ platforms from one gateway | 25–50+ platforms · native mobile apps · voice activation |
| Memory | Layered L0–L3; SQLite embeddings with a **zero-config FTS5 keyword fallback (works with no API key)**; smart prompt-injection trims ~25 k tokens/turn | SQLite + full-text search · agent-curated · Honcho user profiling | Transparent plain Markdown/YAML files you can grep and git-track |
| Extensibility | Hot-reload skills + 6 plugins · self-modifying skills · hooks · MCP client | 40+ built-in tools · **autonomous self-improving skill loop** | Skills as files · very large ecosystem |
| MCP | MCP **client** (connect any MCP server) | MCP client **and `hermes mcp serve`** (acts as an MCP server for Claude Desktop / Cursor / VS Code) | Tool integrations |
| Self-healing | **Startup preflight · dead-man's-switch heartbeat · crash forensic bundles · AI self-diagnosis · crash-loop brake · trend anomaly detection** | Stable in practice; self-improving | Frequent updates can break running instances |
| Security defaults | Exec **allowlist + shell-metachar filter on by default** · DM pairing · timing-safe webhook auth · 0600 file perms enforced · `alvin-bot audit` CLI · honestly documented threat model | Standard | Standard |
| Maturity / community | Small, focused, single-maintainer; modest public adoption | Large community, Nous Research team | Large community + team, Nvidia NemoClaw fork |

### Use the right tool for the job

- **Use Alvin Bot when** you want one resilient, self-healing agent on your own box that keeps working when a provider rate-limits or fails, gives you indexed memory **without buying an embedding API key**, ships safe-by-default execution sandboxing, and is built directly on the official Claude Agent SDK — and you mainly live in Telegram / Slack / Discord / WhatsApp / Signal.
- **Use Hermes Agent when** you want a research-grade self-improving agent, need it to act as an **MCP server** for Claude Desktop / Cursor / VS Code, want 200+ model choice or many execution backends, and value a large community.
- **Use OpenClaw when** you want the **widest messaging reach** (25–50+ channels) plus native mobile apps and voice activation, fully transparent plain-file memory you can git-track, and the largest ecosystem.

A longer head-to-head with FAQ and decision guide: **[Alvin Bot vs Hermes vs OpenClaw](https://alvin.alev-b.com/vs/hermes-openclaw)**.

---

## 🚀 Quick Start

**Brand-new machine? One line is all you need** — even with nothing installed
(no Homebrew, no Xcode tools, no Node, no admin password):

```bash
curl -fsSL https://unpkg.com/alvin-bot/install.sh | bash
```

It reuses an existing Node if you have one, otherwise fetches a self-contained
Node into your home folder, installs Alvin into a user-owned location (never
`sudo`, nothing system-wide), and launches the setup wizard.

**Already have Node 18+?** The classic three commands work too:

```bash
npm install -g alvin-bot
alvin-bot setup
alvin-bot start
```

Either way, the setup wizard validates everything:
- ✅ Lets you pick your AI provider and tests the key
- ✅ Verifies your Telegram bot token
- ✅ Confirms the setup works before you start

**You'll need:** a Telegram bot token ([@BotFather](https://t.me/BotFather)) · your Telegram user ID ([@userinfobot](https://t.me/userinfobot)). Node.js 18+ ([nodejs.org](https://nodejs.org)) is auto-installed by the one-line script if missing.

> **WhatsApp is optional.** Telegram, Slack, Discord, terminal and the web UI work out of the box. The WhatsApp connector is an opt-in add-on (it needs a couple of extra packages) — enable it any time from the Web UI (Platforms → Install Dependencies); the bot shows you the exact one-line command when you first try to use it.

> **Native build note:** Alvin Bot uses `better-sqlite3` for indexed memory. Prebuilt binaries are included for common macOS and Linux environments so most installs need nothing extra. If your platform doesn't have a prebuilt binary and the optional native compilation is skipped, the bot still runs — semantic memory falls back gracefully to keyword search. A C++ toolchain (Xcode Command Line Tools on macOS, `build-essential` on Ubuntu) and Python 3 are only needed if you hit a build-from-source fallback.

Free AI providers are available, with no credit card needed. **Privacy-first?** Pick the 🔒 **Offline — Gemma 4 E4B** option in setup for a fully local LLM via Ollama (macOS/Linux: automated install; Windows: manual).

### 🔐 A note on permission prompts

The first time Alvin reaches for a new tool — a shell command, a file read, a web fetch — you may see a permission prompt from the underlying agent runtime asking whether to allow it. Those prompts come from Alvin himself, not from a third party. Approving one expands what he can do for you autonomously; denying keeps the scope narrow. The more you allow, the more capable and hands-off he becomes — you stay in control either way, and you can always revoke a permission later.

**macOS only — one extra step under launchd.** If you install Alvin as a background service (`alvin-bot launchd install`), macOS won't be able to show you those permission dialogs interactively anymore. To let the bot and anything it spawns (Codex CLI, file-reading skills) actually read your files, grant **Full Disk Access** to `node` once: System Settings → Privacy & Security → Full Disk Access → **+** → add `/opt/homebrew/Cellar/node/<version>/bin/node` (find the exact path with `readlink -f "$(which node)"`). `alvin-bot launchd install` and `alvin-bot doctor` will both detect and remind you with the exact path. After `brew upgrade node` you'll need to re-grant, because TCC binds to the versioned Cellar path. The printable [macOS Setup Guide PDF](https://github.com/alvbln/Alvin-Bot/releases/latest/download/Alvin-Bot-macOS-Setup-Guide.pdf) covers this end-to-end.

### 📘 First-time setup walkthroughs

Step-by-step printable PDF guides:

| Platform | PDF (printable) |
|---|---|
| 🍎 **macOS** (with `launchd` background service) | [Download PDF](https://github.com/alvbln/Alvin-Bot/releases/latest/download/Alvin-Bot-macOS-Setup-Guide.pdf) |
| 🪟 **Windows** (with Task Scheduler / Startup folder) | [Download PDF](https://github.com/alvbln/Alvin-Bot/releases/latest/download/Alvin-Bot-Windows-Setup-Guide.pdf) |

Both guides cover: Node.js install · Telegram bot creation · first-time `setup` · foreground test · background service · offline Gemma 4 mode · troubleshooting. ~15 min end-to-end for a first-time user.

### macOS: use `launchd` instead of pm2 (recommended)

If you're on macOS and using Claude Code (Max subscription) as your provider, run the bot as a **LaunchAgent** — it inherits the GUI login session so the macOS Keychain stays unlocked and the Claude OAuth token just works without any manual `security unlock-keychain` dance:

```bash
alvin-bot launchd install    # writes ~/Library/LaunchAgents/com.alvinbot.app.plist and starts the agent
alvin-bot launchd status     # show PID + recent stdout/stderr logs
alvin-bot launchd uninstall  # unload + remove the plist
```

pm2 still works and remains the default on Linux/Windows, but on macOS with Claude Code, `launchd` is the only path that reliably keeps Keychain access across restarts.

### 📖 Handbook

For a full walkthrough of everything Alvin Bot can do — providers, sub-agents, cron jobs, plugins, MCP, security audit, web UI — read **[`docs/HANDBOOK.md`](docs/HANDBOOK.md)**.

### AI Providers

| Provider | Cost | Best for |
|----------|------|----------|
| **Groq** | Free | Getting started fast |
| **Google Gemini** | Free | Image understanding, embeddings |
| **NVIDIA NIM** | Free | Tool use, 150+ models |
| OpenAI | Paid | GPT-4o quality |
| OpenRouter | Paid | 100+ models marketplace |
| Claude SDK | Paid* | Full agent with tool use |

\*Claude SDK requires a [Claude Max](https://claude.ai) subscription ($20/mo) or Anthropic API access. The setup wizard checks this automatically.

### Alternative Installation

<details>
<summary>One-line install script (Linux/macOS) — mirror URL</summary>

The one-liner from Quick Start, served from a second CDN if the first is blocked:

```bash
curl -fsSL https://cdn.jsdelivr.net/npm/alvin-bot/install.sh | bash
```

Bootstraps Node if missing (self-contained, no `sudo`), installs Alvin into a
user-owned location, and runs the setup wizard automatically.
</details>

<details>
<summary>Desktop App (macOS)</summary>

| Platform | Download | Architecture |
|----------|----------|-------------|
| macOS | [DMG](https://github.com/alvbln/Alvin-Bot/releases/latest) | Apple Silicon (M1+) |
| Windows | Coming soon | x64 |
| Linux | Coming soon | x64 |

The desktop app auto-starts the bot and provides a system tray icon with quick controls.
</details>

<details>
<summary>Docker</summary>

```bash
git clone https://github.com/alvbln/Alvin-Bot.git
cd Alvin-Bot
cp .env.example .env    # Edit with your tokens
docker compose up -d
```

Note: Claude SDK is not compatible with Docker (requires interactive CLI login).
</details>

<details>
<summary>From Source (contributors)</summary>

```bash
git clone https://github.com/alvbln/Alvin-Bot.git
cd Alvin-Bot
npm install
npm run build
node bin/cli.js setup   # Interactive wizard
npm run dev             # Start in dev mode
```
</details>

<details>
<summary>Production (PM2)</summary>

```bash
npm install -g pm2
pm2 start ecosystem.config.cjs
pm2 save && pm2 startup
```
</details>

### Troubleshooting

```bash
alvin-bot doctor        # Check configuration & validate connections
```

If your AI provider isn't working, run `doctor` — it tests the actual API connection and shows exactly what's wrong.

---

## 📋 Commands

| Command | Description |
|---------|-------------|
| `/help` | Show all commands |
| `/start` | Session status overview |
| `/new` | Fresh conversation (reset context) |
| `/model` | Switch AI model (inline keyboard) |
| `/effort <low\|medium\|high\|max>` | Set thinking depth |
| `/voice` | Toggle voice replies |
| `/imagine <prompt>` | Generate images |
| `/web <query>` | Search the web |
| `/remind <time> <text>` | Set reminders (e.g., `/remind 30m Call mom`) |
| `/cron` | Manage scheduled tasks |
| `/recall <query>` | Search memory |
| `/remember <text>` | Save to memory |
| `/export` | Export conversation |
| `/dir <path>` | Change working directory |
| `/workspaces` | List all configured workspaces (v4.12.0) |
| `/workspace [name]` | Show or switch the active workspace — `/workspace default` resets (v4.12.0) |
| `/status` | Current session & cost info |
| `/setup` | Configure API keys & platforms |
| `/system <prompt>` | Set custom system prompt |
| `/fallback` | View & reorder provider fallback chain |
| `/skills` | List available skills & their triggers |
| `/lang <de\|en\|auto>` | Set or auto-detect response language |
| `/cancel` | Abort running request |
| `/reload` | Hot-reload personality (SOUL.md) |

---

## 🏗️ Architecture

```
  Telegram   Slack   WhatsApp   Discord   Signal   Web UI · TUI · CLI
      └─────────┴─────────┴────┬────┴─────────┴───────────┘
                               ▼
              Workspace Resolver  (per-channel cwd + persona)
                               ▼
                 Engine  (routing · fallback · heartbeat)
        ┌──────────────────────┼───────────────────────────┐
        ▼                      ▼                            ▼
   Claude SDK         OpenAI · Groq · Gemini ·      Ollama · Codex CLI ·
                      NVIDIA · OpenRouter           OpenAI-compatible
        │
        ├─ reads ▶  Memory Layer
        │              ├─ L0 / L1 — identity.md · preferences.md (always plain-text)
        │              └─ SQLite store — provider auto-detect (Gemini · OpenAI · Ollama · FTS5)
        │
        └─ dispatches ▶  Detached sub-agents  (independent `claude -p`, survive parent abort)
```

> Rendered as plain text so it displays identically on npm **and** GitHub
> (npm's README renderer does not support Mermaid diagrams).

### Provider matrix

| Provider | Tool use | Streaming | Vision | Auth |
|---|---|---|---|---|
| Claude SDK (Agent) | ✅ native (Bash, Read, Write, Web, MCP) | ✅ | ✅ | Claude CLI OAuth |
| OpenAI · Groq · Gemini · NVIDIA NIM · OpenRouter | ✅ universal tool use | ✅ | varies | API key |
| Ollama (local) | ✅ via tool-bridge | ✅ | varies | none |
| Codex CLI | ✅ subprocess | ✅ | — | Codex CLI auth |
| Any OpenAI-compatible | ⚡ auto-detect | ✅ | varies | API key |

> **Universal tool use** — Alvin gives full agent powers to any provider that supports function calling. Shell · files · Python · web work everywhere; providers without tool calls degrade cleanly to text-only chat.

### Project layout

```
src/
├── index.ts                 entry point
├── engine.ts                multi-model query engine
├── handlers/                message + command handlers
├── platforms/               Telegram · Slack · WhatsApp · Discord · Signal
├── providers/               Claude SDK · OpenAI-compat · Ollama · Codex CLI
├── services/
│   ├── embeddings/          v4.22 pluggable provider facade (Gemini/OpenAI/Ollama/FTS5)
│   ├── memory*.ts           layered memory (L0-L3) + inject-mode resolver
│   ├── workspaces.ts        per-channel cwd + persona registry
│   ├── alvin-dispatch.ts    detached sub-agent orchestration
│   ├── browser-manager.ts   4-tier browser strategy
│   └── …                    cron · voice · skills · MCP · hooks · …
├── tui/                     terminal chat UI
└── web/                     dashboard server + APIs
web/public/                  zero-build HTML/CSS/JS UI
plugins/                     6 built-in plugins (hot-reload)
skills/                      14 SKILL.md files (hot-reload)
bin/cli.js                   CLI entry point
electron/                    Electron wrapper for the .dmg build
```

---

## 🧭 Multi-Session Workspaces (v4.12.0)

**Run multiple parallel Alvin sessions on the same bot — one per project, context-isolated, memory shared.** Think Claude Coworker, but on your own machine with your own tools. Each workspace has its own working directory, purpose, and optional persona. Sub-agents spawned in one workspace stay in that workspace. Memory, skills, and the knowledge base are globally shared across all of them.

### Why you'd want this

Without workspaces, Alvin has one big blob of context. If you ask about one project's deployment right after debugging a completely unrelated service, Claude pollutes one context with the other. Workspaces solve this: **Slack channel = session**, or on Telegram, **`/workspace my-project` = session**. Each one has its own Claude SDK `resume` token, history, and current project CLAUDE.md loaded via its working directory.

### How it works

1. **Drop a markdown file** into `~/.alvin-bot/workspaces/<name>.md` with YAML frontmatter.
2. **Alvin hot-reloads** the workspace registry (no restart needed — same pattern as skills).
3. On **Slack**, workspaces resolve by explicit channel ID first, then by channel name match (`#my-project` → `workspaces/my-project.md`, case-insensitive).
4. On **Telegram**, run `/workspace <name>` to switch — next message uses the new persona and cwd.
5. Nothing configured? Alvin falls back to the "default" workspace exactly like pre-v4.12 — **no breaking changes**.
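
The Slack resolution order in the steps above (explicit channel ID, then case-insensitive name match, then the default workspace) might look like this in code. It is a sketch under assumptions: the `Workspace` shape and the `resolveSlackWorkspace` function are hypothetical, and only the `channels` field and the matching order come from this README.

```typescript
// Hypothetical in-memory view of a parsed workspace file.
interface Workspace {
  name: string;        // file name without ".md", e.g. "my-project"
  channels?: string[]; // optional explicit Slack channel IDs from the frontmatter
}

function resolveSlackWorkspace(
  workspaces: Workspace[],
  channelId: string,
  channelName: string,
): string {
  // 1. An explicit channel ID wins.
  const byId = workspaces.find((w) => (w.channels ?? []).includes(channelId));
  if (byId) return byId.name;

  // 2. Otherwise compare the channel name with the file name, ignoring case
  //    and a leading "#".
  const wanted = channelName.replace(/^#/, "").toLowerCase();
  const byName = workspaces.find((w) => w.name.toLowerCase() === wanted);

  // 3. Nothing matched: fall back to the default workspace.
  return byName ? byName.name : "default";
}
```

Checking IDs before names means a renamed Slack channel keeps its workspace as long as the ID is listed in `channels`.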

### Example workspace file

Create `~/.alvin-bot/workspaces/my-project.md`:

```markdown
---
purpose: my-project website dev
cwd: ~/Projects/my-project
emoji: "🏢"
color: "#6366f1"
channels: ["C01ABCDEF"]
---
You are focused on the my-project website. Stack: React + Express +
Drizzle + MySQL. Production VPS at your-vps.example.com, deploy via rsync.
Prefer concise, directly actionable answers about features, deployment,
and Stripe integration.
```

The `cwd` auto-loads the project-specific `CLAUDE.md` via Claude SDK's `settingSources: ["user", "project"]`, so each workspace inherits its project's conventions automatically. `channels` is optional — omit it to match by filename.

### Slack setup (5 minutes)

1. Download the setup guide + manifest from the [latest release](https://github.com/alvbln/Alvin-Bot/releases/latest):
   - `slack-setup.md` — step-by-step instructions
   - `slack-manifest.json` — copy-paste ready Slack App manifest
2. Create a Slack App from the manifest at https://api.slack.com/apps → **Create New App** → **From an app manifest**
3. Enable Socket Mode, generate an **App-Level Token** (starts with `xapp-`)
4. Install the app to your workspace, copy the **Bot User OAuth Token** (starts with `xoxb-`)
5. Add both to `~/.alvin-bot/.env`:
   ```bash
   SLACK_APP_TOKEN=xapp-1-...
   SLACK_BOT_TOKEN=xoxb-...
   SLACK_ALLOWED_USERS=U01ABCDEF      # optional, comma-separated
   ```
6. Restart Alvin. You should see `💬 Slack connected (Alvin @ YourWorkspace)` in the log.
7. Invite Alvin to channels with `/invite @Alvin`. DMs work without an invite.

### Telegram `/workspace` commands

| Command | Effect |
|---|---|
| `/workspaces` | List all configured workspaces with emojis and purposes (active one marked ✅) |
| `/workspace` | Show the currently active workspace |
| `/workspace <name>` | Switch to `<name>` — next message uses its persona and cwd |
| `/workspace default` | Reset to the default workspace (global cwd, no persona) |

Workspace selection is per Telegram user, persisted across bot restarts via `~/.alvin-bot/state/sessions.json` (v2 envelope format, backwards compatible with v4.11).

### Web UI

The dashboard has a dedicated **🧭 Workspaces** tab (Data section in the sidebar). Each workspace shows as a color-coded card with emoji, purpose, cwd, mapped channels, session count, message count, and cumulative cost. Useful for spotting which project is burning the most tokens.

Or query directly:

```bash
curl -s http://localhost:3100/api/workspaces | jq
```

### Architecture guarantees

- **Memory is global.** Facts Alvin learns in one workspace are visible in every other workspace via the shared `MEMORY.md` and embeddings index. A per-workspace memory layer is on the v4.13 roadmap.
- **Sub-agents are per-session.** Each workspace can dispatch its own detached sub-agents via `alvin_dispatch_agent` — results come back to the originating channel on any platform (Telegram, Slack, Discord, WhatsApp), visible in `/subagents list` (v4.13.0+ dispatch, v4.14.0 cross-platform, v4.14.1 unified list view).
- **Session state survives restart.** Claude SDK `resume` tokens, conversation history, language, effort, and `workspaceName` all persist via `session-persistence.ts` (v4.11.0).
- **Backwards compatible.** If you don't create any workspace files, everything behaves exactly like v4.11. Upgrade is a no-op.

---

## ⚙️ Configuration

### Environment Variables

```env
# Required
BOT_TOKEN=<Telegram Bot Token>
ALLOWED_USERS=<comma-separated Telegram user IDs>

# AI Providers (at least one needed)
# Claude SDK uses CLI auth — no key needed
GROQ_API_KEY=<key>              # Groq (voice + fast models)
NVIDIA_API_KEY=<key>            # NVIDIA NIM models
GOOGLE_API_KEY=<key>            # Gemini + image generation
OPENAI_API_KEY=<key>            # OpenAI models
OPENROUTER_API_KEY=<key>        # OpenRouter (100+ models)

# Provider Selection
PRIMARY_PROVIDER=claude-sdk     # Primary AI provider
FALLBACK_PROVIDERS=nvidia-kimi-k2.5,nvidia-llama-3.3-70b

# Memory backend (v4.22+) — auto-detects based on what keys you have.
# Set to override the default priority: gemini → openai → ollama → fts5.
# fts5 is the zero-config keyword fallback — no key needed, works for everyone.
EMBEDDINGS_PROVIDER=auto                  # auto | gemini | openai | ollama | fts5
OLLAMA_EMBEDDING_MODEL=nomic-embed-text   # only used for ollama provider
MEMORY_INJECT_MODE=auto                   # auto | legacy | sqlite (see CHANGELOG v4.22)

# Optional Platforms
WHATSAPP_ENABLED=true           # Enable WhatsApp (needs Chrome)
DISCORD_TOKEN=<token>           # Enable Discord
SIGNAL_API_URL=<url>            # Signal REST API URL
SIGNAL_NUMBER=<number>          # Signal phone number
SLACK_BOT_TOKEN=xoxb-...        # Slack Bot User OAuth Token (Socket Mode)
SLACK_APP_TOKEN=xapp-1-...      # Slack App-Level Token (connections:write scope)
SLACK_ALLOWED_USERS=U01...      # Optional: comma-separated Slack user IDs allowlist

# Multi-Session (v4.12.0)
SESSION_MODE=per-channel        # per-user (default) | per-channel | per-channel-peer
                                # per-channel gives each Slack channel / group its own isolated session

# Optional
WORKING_DIR=~                   # Default working directory (used when no workspace is resolved)
MAX_BUDGET_USD=5.0              # Cost limit per session
WEB_PORT=3100                   # Web UI port
WEB_PASSWORD=<password>         # Web UI auth (optional)
CHROME_PATH=/path/to/chrome     # Custom Chrome path (for WhatsApp)
MEMORY_EXTRACTION_DISABLED=1    # Opt out of v4.11.0 auto-fact-extraction in compaction
```
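
To make the `EMBEDDINGS_PROVIDER` auto-detection concrete, here is a minimal sketch of the documented priority (gemini → openai → ollama → fts5). The function name and the `ollamaReachable` flag are hypothetical, and treating `GOOGLE_API_KEY` / `OPENAI_API_KEY` as the signals for the Gemini and OpenAI backends is an assumption; this README does not say how the bot actually probes each backend.

```typescript
type EmbeddingsProvider = "gemini" | "openai" | "ollama" | "fts5";

const KNOWN_PROVIDERS: EmbeddingsProvider[] = ["gemini", "openai", "ollama", "fts5"];

// `ollamaReachable` stands in for whatever local probe the bot performs (assumption).
function resolveEmbeddingsProvider(
  env: Record<string, string | undefined>,
  ollamaReachable: boolean,
): EmbeddingsProvider {
  // An explicit override always wins over auto-detection.
  const override = (env.EMBEDDINGS_PROVIDER ?? "auto").toLowerCase();
  const forced = KNOWN_PROVIDERS.find((p) => p === override);
  if (forced) return forced;

  // "auto" (or any unrecognised value): walk the documented priority order.
  if (env.GOOGLE_API_KEY) return "gemini";
  if (env.OPENAI_API_KEY) return "openai";
  if (ollamaReachable) return "ollama";
  return "fts5"; // zero-config keyword fallback, no key needed
}
```

Because `fts5` is the last branch and needs nothing, the function always returns a usable backend, which is the property the zero-config claim relies on.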

### Custom Models

Add any OpenAI-compatible model via `docs/custom-models.json`:

```json
[
  {
    "key": "my-local-llama",
    "name": "Local Llama 3",
    "model": "llama-3",
    "baseUrl": "http://localhost:11434/v1",
    "apiKeyEnv": "OLLAMA_API_KEY",
    "supportsVision": false,
    "supportsStreaming": true
  }
]
```
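
If you generate or lint this file from your own tooling, a typed view of one entry can help. The sketch below mirrors the fields in the example above; the `CustomModel` interface, both function names, and the rule that the five string fields are required while the two flags default to `false` are assumptions, not the bot's documented schema.

```typescript
// Field names taken from the JSON example; required/optional status is assumed.
interface CustomModel {
  key: string;
  name: string;
  model: string;
  baseUrl: string;
  apiKeyEnv: string; // name of the environment variable that holds the key
  supportsVision: boolean;
  supportsStreaming: boolean;
}

// Parse the file contents and reject entries that lack a required string field.
function parseCustomModels(jsonText: string): CustomModel[] {
  const raw: unknown = JSON.parse(jsonText);
  if (!Array.isArray(raw)) {
    throw new Error("custom models file must contain a JSON array");
  }
  return raw.map((entry: any, index: number): CustomModel => {
    for (const field of ["key", "name", "model", "baseUrl", "apiKeyEnv"]) {
      if (typeof entry?.[field] !== "string") {
        throw new Error(`entry ${index}: "${field}" must be a string`);
      }
    }
    return {
      key: entry.key,
      name: entry.name,
      model: entry.model,
      baseUrl: entry.baseUrl,
      apiKeyEnv: entry.apiKeyEnv,
      supportsVision: entry.supportsVision === true,
      supportsStreaming: entry.supportsStreaming === true,
    };
  });
}

// `apiKeyEnv` is an indirection: the key itself stays in the environment.
function resolveApiKey(
  model: CustomModel,
  env: Record<string, string | undefined>,
): string | undefined {
  return env[model.apiKeyEnv];
}
```

Keeping only the variable name in the JSON means the file can be committed or shared without exposing the secret stored in `.env`.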

### Personality

Edit `SOUL.md` to customize the bot's personality. Changes apply on `/reload` or bot restart.

### WhatsApp Setup

WhatsApp uses [whatsapp-web.js](https://github.com/nicholascui/whatsapp-web.js) — the bot runs as **your own WhatsApp account** (not a separate business account). Chrome/Chromium is required.

**1. Enable WhatsApp**

Set `WHATSAPP_ENABLED=true` in `.env` (or toggle via Web UI → Platforms → WhatsApp). Restart the bot.

**2. Scan QR Code**

On first start, a QR code appears in the terminal (and in the Web UI). Scan it with WhatsApp on your phone (Settings → Linked Devices → Link a Device). The session persists across restarts.

**3. Chat Modes**

| Mode | Env Variable | Description |
|------|-------------|-------------|
| **Self-Chat** | *(always on)* | Send yourself messages → bot responds. Your AI notepad. |
| **Groups** | `WHATSAPP_ALLOW_GROUPS=true` | Bot responds in whitelisted groups. |
| **DMs** | `WHATSAPP_ALLOW_DMS=true` | Bot responds to private messages from others. |
| **Self-Chat Only** | `WHATSAPP_SELF_CHAT_ONLY=true` | Disables groups and DMs — only self-chat works. |

All toggles are also available in the Web UI (Platforms → WhatsApp). Changes apply instantly — no restart needed.

**4. Group Whitelist**

Groups must be explicitly enabled. In the Web UI → Platforms → WhatsApp → Group Management:

- **Enable** a group to let the bot listen
- **Allowed Contacts** — Select who can trigger the bot (empty = everyone)
- **@ Mention Required** — Bot only responds when mentioned (voice/media bypass this)
- **Process Media** — Allow photos, documents, audio, video
- **Approval Required** — Owner must approve each message via Telegram before the bot responds. Group members see nothing of the approval step.
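
Taken together, the per-group settings above amount to a small gate in front of every group message. The sketch below is one plausible reading, not the bot's actual logic: all type and function names are hypothetical, the order of the checks is an assumption, and it also encodes the rule that the owner's own group messages are never processed.

```typescript
// Hypothetical shapes; only the setting semantics come from this README.
interface GroupSettings {
  enabled: boolean;
  allowedContacts: string[]; // empty list = everyone may trigger the bot
  mentionRequired: boolean;
  processMedia: boolean;
}

interface IncomingGroupMessage {
  sender: string;
  fromOwner: boolean;   // sent by the linked WhatsApp account itself
  mentionsBot: boolean;
  kind: "text" | "voice" | "media";
}

function shouldProcess(group: GroupSettings, msg: IncomingGroupMessage): boolean {
  if (!group.enabled) return false; // groups are opt-in
  if (msg.fromOwner) return false;  // own messages in groups are never processed
  if (group.allowedContacts.length > 0 && !group.allowedContacts.includes(msg.sender)) {
    return false;
  }
  // Assumption: voice counts as media for the "Process Media" switch.
  if (msg.kind !== "text" && !group.processMedia) return false;
  // Voice and media bypass the mention requirement; plain text does not.
  if (group.mentionRequired && msg.kind === "text" && !msg.mentionsBot) return false;
  return true;
}
```

A message that passes this gate would then still go through the approval step if **Approval Required** is switched on for the group.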

> **Note:** Your own messages in groups are never processed (you ARE the bot on WhatsApp). The bot only responds to other participants. In self-chat, your messages are always processed normally.

**5. Approval Flow** (when enabled per group)

1. Someone writes in a whitelisted group
2. You get a Telegram notification with the message preview + ✅ Approve / ❌ Deny buttons
3. Approve → bot processes and responds in WhatsApp. Deny → silently dropped.
4. Fallback channels if Telegram is unavailable: WhatsApp self-chat → Discord → Signal
5. Unapproved messages expire after 30 minutes.
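
The approval flow can be modelled with two small pure functions: one that picks where to send the approval request, and one that decides what happens to a pending message. This is an illustrative sketch; the channel identifiers, type names, and function names are hypothetical, while the fallback order and the 30-minute expiry come from the steps above.

```typescript
const APPROVAL_TTL_MS = 30 * 60 * 1000; // unapproved messages expire after 30 minutes

// Telegram first, then the documented fallbacks in order.
const NOTIFY_ORDER = ["telegram", "whatsapp-self-chat", "discord", "signal"] as const;
type NotifyChannel = (typeof NOTIFY_ORDER)[number];

// Pick the first channel in priority order that is currently available.
function pickNotifyChannel(available: ReadonlySet<NotifyChannel>): NotifyChannel | undefined {
  return NOTIFY_ORDER.find((channel) => available.has(channel));
}

interface PendingApproval {
  id: string;
  receivedAt: number; // epoch milliseconds
}

type Outcome = "respond" | "drop" | "expired";

// Decide what to do when the owner taps Approve or Deny at time `now`.
function resolveApproval(
  pending: PendingApproval,
  decision: "approve" | "deny",
  now: number,
): Outcome {
  if (now - pending.receivedAt > APPROVAL_TTL_MS) return "expired";
  return decision === "approve" ? "respond" : "drop";
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the expiry rule easy to test.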

---

## 🔌 Plugins

Built-in plugins in `plugins/`:

| Plugin | Description |
|--------|-------------|
| weather | Current weather & forecasts |
| finance | Stock prices & crypto |
| notes | Personal note-taking |
| calendar | Calendar integration |
| email | Email management |
| smarthome | Smart home control |

Plugins are auto-loaded at startup. Create your own by adding a directory with an `index.js` exporting a `PluginDefinition`.

---

## 🎯 Skills

Skills are markdown files in `skills/` that auto-activate when the user's message matches their trigger keywords. The skill body gets injected into the system prompt, giving the agent specialized expertise on demand. 14 ship built-in:

| Skill | Description |
|---|---|
| **agent-browser** | Token-efficient web automation via the agent-browser CLI (accessibility-tree snapshots) — Tier 1.5 of the browser stack |
| **apple-notes** | Read, create, search Apple Notes via AppleScript (macOS) |
| **browse** | 3-tier browser control: WebFetch · stealth Playwright · CDP with persistent profile |
| **code-project** | Software development workflows: build, debug, refactor, architecture patterns |
| **data-analysis** | CSV / JSON / Excel processing, charts, statistics via Python |
| **document-creation** | Professional PDFs, reports, letters with formatting |
| **email-summary** | Inbox triage, newsletter digests, priority sorting |
| **github** | Issues, PRs, releases, workflows via the `gh` CLI |
| **social-fetch** | Analyse Instagram / TikTok / YouTube / X URLs the user shares |
| **summarize** | Condense URLs, PDFs, long documents |
| **system-admin** | Server management, deploys, Docker, nginx, SSL |
| **weather** | Forecasts and conditions |
| **web-research** | Deep multi-source research with citation aggregation |
| **webcheck** | Security / SEO audit of a website |

Drop your own `<name>/SKILL.md` into `~/.alvin-bot/skills/` for hot-reload. List active skills via `/skills` or `alvin-bot skills`.
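
As a rough illustration of keyword-triggered activation, the sketch below matches a message against each skill's trigger list and appends the matching skill bodies to the system prompt. The `Skill` shape, the case-insensitive substring matching, and the `## Skill:` heading are assumptions made for the example. The real `SKILL.md` format and matching rules may differ.

```typescript
// Hypothetical sketch; the real skill loader and SKILL.md schema may differ.
interface Skill {
  name: string;
  triggers: string[]; // assumed: plain keywords or phrases
  body: string;       // markdown body of the skill file
}

// A skill activates when any of its triggers occurs in the message
// (case-insensitive substring match, an assumption for this sketch).
function matchSkills(message: string, skills: Skill[]): Skill[] {
  const text = message.toLowerCase();
  return skills.filter((skill) =>
    skill.triggers.some((trigger) => text.indexOf(trigger.toLowerCase()) !== -1)
  );
}

// Append each active skill body to the base system prompt.
function buildSystemPrompt(base: string, active: Skill[]): string {
  if (active.length === 0) return base;
  const sections = active.map((skill) => `## Skill: ${skill.name}\n${skill.body}`);
  return [base].concat(sections).join("\n\n");
}
```

Because only matching skills are injected, the prompt stays small for messages that need no specialized expertise.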

---

## 🛠️ CLI

### Core lifecycle

```bash
alvin-bot setup           # Interactive setup wizard (Telegram + AI provider + tools)
alvin-bot start           # Start the bot in background (launchd on macOS, pm2 elsewhere)
alvin-bot start -f        # Start in foreground (for debugging)
alvin-bot stop            # Stop the running bot
alvin-bot status          # Show version + LaunchAgent / pm2 state (offline)
alvin-bot doctor          # Health check — config, provider, memory, permissions
alvin-bot update          # Pull latest from npm (or git if running from source)
alvin-bot version         # Show version
```

### Interactive chat

```bash
alvin-bot tui             # Terminal chat UI with streaming + ANSI colors ✨
alvin-bot chat            # Alias for tui
alvin-bot tui --lang de   # Force German UI
```

### AI provider management (since 4.24.0)

Switch between Claude SDK / Codex CLI / Groq / Gemini / OpenAI / OpenRouter / NVIDIA NIM / offline Gemma 4 without re-running the full setup wizard. The switch command runs the same install + auth flow the wizard uses (CLI install + OAuth login for `claude-sdk` / `codex-cli`, API-key prompt + live validation for the rest), then does a byte-preserving merge of `~/.alvin-bot/.env` — the previous provider's API key is **parked, not deleted**, so rollback is one un-comment away.

```bash
alvin-bot provider list                # Show all providers + per-provider install/key status
alvin-bot provider show                # Detailed info on the currently configured provider
alvin-bot provider switch <key>        # Switch (interactive setup + .env merge + bot restart)
alvin-bot provider doctor              # Validate current provider's auth against its API
```

`<key>` accepts canonical slugs **or** short aliases: `claude`, `codex`, `gemini`, `nvidia`, `gpt`, `gemma`.
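
The "parked, not deleted" merge can be pictured as a line-by-line rewrite of the `.env` text: the old provider's key line is commented out, the new key is set, and every other line passes through unchanged. This is a sketch under assumptions, not the shipped merge code. `parkAndSet` is a hypothetical helper, and the variable names in the usage note (`GROQ_API_KEY`, `OPENAI_API_KEY`) are illustrative.

```typescript
// Hypothetical sketch of a line-preserving .env merge.
function parkAndSet(envText: string, parkKey: string, setKey: string, setValue: string): string {
  const newLine = setKey + "=" + setValue;
  let replaced = false;
  const out = envText.split("\n").map((line) => {
    if (line.indexOf(parkKey + "=") === 0) return "# " + line; // park, do not delete
    if (line.indexOf(setKey + "=") === 0) {
      replaced = true;
      return newLine; // overwrite an existing value in place
    }
    return line; // every other line is passed through unchanged
  });
  if (!replaced) {
    // Keep a trailing newline at the end of the file if there was one.
    if (out.length > 0 && out[out.length - 1] === "") out.splice(out.length - 1, 0, newLine);
    else out.push(newLine);
  }
  return out.join("\n");
}
```

With this helper, `parkAndSet("GROQ_API_KEY=abc\nPORT=3100\n", "GROQ_API_KEY", "OPENAI_API_KEY", "xyz")` leaves the old key in the file as `# GROQ_API_KEY=abc`, so rolling back means removing the leading `# `.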

### Optional tools — install / update (since 4.23.0)

A curated set of universally useful CLIs that unlock specific skills. Bootstrap tools (`yt-dlp`, `ffmpeg`, and `wacli` if WhatsApp is enabled) are auto-installed/updated by `setup` and `update`; the rest you opt into through the menu.

```bash
alvin-bot tools list                   # Show installed / missing optional tools
alvin-bot tools install                # Interactive menu — pick which to install
```

### macOS permissions wizard (since 5.1.0)

macOS's TCC framework does not let any app grant itself Full Disk Access / Automation / Accessibility programmatically; only the user can flip those switches. The wizard makes the toggling painless: it detects every permission's current state, opens the **exact right Settings pane** for each missing one, waits for you to toggle (polling every 2 s for up to 60 s per permission), verifies, and moves on. It also bundles sudo-password storage into the same upfront flow.

```bash
alvin-bot permissions status           # Quick status: all 4 permissions + current state
alvin-bot permissions wizard           # Interactive guided setup, one-and-done
alvin-bot permissions open <id>        # Open one Settings pane (full-disk-access / automation / accessibility)
alvin-bot perms                        # alias for permissions
```
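
The wait step (poll every 2 s, give up after 60 s) boils down to a bounded polling loop. The sketch below is illustrative only: `waitForPermission` is a hypothetical name, and the permission check and the sleep are injected as plain callbacks so the loop stays synchronous and testable. A real CLI would await a timer instead.

```typescript
// Hypothetical sketch of the wizard's per-permission wait loop.
type PollResult = "granted" | "timed-out";

function waitForPermission(
  isGranted: () => boolean,        // probes the permission's current state
  sleep: (ms: number) => void,     // injected so tests need no real delay
  intervalMs: number = 2000,       // poll every 2 s
  timeoutMs: number = 60000        // give up after 60 s
): PollResult {
  let waited = 0;
  while (true) {
    if (isGranted()) return "granted";
    if (waited >= timeoutMs) return "timed-out";
    sleep(intervalMs);
    waited += intervalMs;
  }
}
```

Checking before sleeping means an already-granted permission returns immediately, so re-running the wizard costs nothing for permissions that are already set.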

### LaunchAgent (macOS only)

```bash
alvin-bot launchd install              # Write ~/Library/LaunchAgents/com.alvinbot.app.plist + load
                                       # (Also installs the dead-man-switch companion plist since 4.26.0)
alvin-bot launchd status               # Show PID + recent stdout/stderr from the LaunchAgent
alvin-bot launchd uninstall            # Unload + remove both plists
```

### Browser automation (bot-managed Chromium)

```bash
alvin-bot browser start                # Launch Chromium with CDP, persistent profile
alvin-bot browser start headful        # Same, visible (for login flows)
alvin-bot browser goto <url>           # Open URL, return JSON metadata
alvin-bot browser shot <url> [file]    # Screenshot → ~/.alvin-bot/browser/screenshots/
alvin-bot browser eval <url> "<js>"    # Run JS in page context
alvin-bot browser tabs                 # List open tabs
alvin-bot browser status               # PID + CDP endpoint
alvin-bot browser stop                 # Quit Chromium
alvin-bot browser doctor               # Diagnose Chromium / Playwright setup
```

### Maintenance & introspection

```bash
alvin-bot audit                        # Security health check — permissions, secrets, config
alvin-bot search "<query>"             # Search assets, memories, and skills index
```

### Environment-variable opt-outs (Self-Preservation features since 4.26.0 / 5.0.0)

Granular opt-outs for the resilience subsystems, plus three tuning values at the end. Every subsystem is enabled by default:

```bash
ALVIN_DISABLE_SELF_PRESERVATION=true   # Kill ALL Phase-1 + Phase-2 features below
ALVIN_DISABLE_PREFLIGHT=true           # Skip startup sanity check (Telegram, provider, SQLite, disk)
ALVIN_DISABLE_CRITICAL_NOTIFY=true     # Skip cross-channel alerts (Telegram + macOS notif + file flag)
ALVIN_DISABLE_DEAD_MAN=true            # Skip the zombie-detection heartbeat writer
ALVIN_DISABLE_AUTO_DIAGNOSTIC=true     # Skip forensic-bundle writing on crash
ALVIN_DISABLE_SELF_DIAGNOSIS=true      # Skip AI analysis of forensic bundles at startup
ALVIN_DISABLE_TRENDS=true              # Skip daily trend snapshots + AI anomaly detection
ALVIN_DEADMAN_THRESHOLD_SEC=600        # Dead-man's-switch staleness threshold (default 10 min)
ALVIN_TRENDS_INTERVAL_HOURS=24         # Trend-snapshot cadence (default 24 h)
ALVIN_TRENDS_AI_AFTER_DAYS=7           # Days of history before AI anomaly detection kicks in
```
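
To show how these variables could interact, here is a minimal sketch of a dead-man's-switch check. It is not the bot's implementation: the function names are hypothetical, treating only the literal string `true` as an opt-out is an assumption, and the awake-time clamp follows the v5.44.1 description in the roadmap (staleness is capped by how long the machine has actually been awake).

```typescript
// Hypothetical sketch; names and parsing rules are assumptions.
type Env = Record<string, string | undefined>;

// The master switch disables everything; the specific switch disables one feature.
function deadManEnabled(env: Env): boolean {
  return env["ALVIN_DISABLE_SELF_PRESERVATION"] !== "true" &&
         env["ALVIN_DISABLE_DEAD_MAN"] !== "true";
}

// Fall back to the documented default of 600 s (10 min) on missing or invalid input.
function deadManThresholdSec(env: Env): number {
  const raw = env["ALVIN_DEADMAN_THRESHOLD_SEC"];
  const parsed = raw === undefined ? NaN : Number(raw);
  return isFinite(parsed) && parsed > 0 ? parsed : 600;
}

// A heartbeat is stale only if it is old AND the machine was awake long enough
// for the bot to have written a newer one.
function isHeartbeatStale(lastBeatSec: number, nowSec: number, awakeForSec: number, limitSec: number): boolean {
  const age = nowSec - lastBeatSec;
  const effectiveAge = Math.min(age, awakeForSec);
  return effectiveAge > limitSec;
}
```

With the clamp, a laptop that slept for an hour and woke 30 seconds ago reports an effective age of 30 s, so no restart is triggered. An awake machine with a 15-minute-old heartbeat is still flagged.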

---

## 🗺️ Roadmap

> Per-version details: see [`CHANGELOG.md`](CHANGELOG.md). The roadmap is a forward-looking summary, not a changelog.

### ✅ Recently shipped

| Version | Theme | Highlights |
|---|---|---|
| **v5.44.1** *(June 2026)* | Sleep-aware crash-backstop | The external crash-backstop that restarts a frozen bot is now sleep-aware too (the in-app watchdog already was, since v5.33): it clamps the staleness window by how long the machine has actually been awake, so a laptop that merely slept past the threshold no longer triggers a spurious restart on wake. A genuine freeze on an awake machine is still recovered. Self-applies on next start; always-on desktops were never affected. |
| **v5.36** *(June 2026)* | No reconnect spam on wake | A laptop wake re-establishes the dropped Telegram connection silently instead of DMing a "subsystem restart" every time — recognised by the sleep-sized staleness / recent resume. A genuine runner failure still notifies. Completes the sleep-hardening series. |
| **v5.35** *(June 2026)* | Reliable update + sleep-calm | `/update` fixed for `npm install -g` (skips the flaky Puppeteer Chromium download) and survives a dropped network mid-update; watchdog grants a post-wake reconnect grace window; trend monitor no longer flags a frequently-sleeping laptop's reconnect noise as a crash loop. |
| **v5.34** *(May 2026)* | Budget cap off by default | The optional `MAX_BUDGET_USD` daily cap no longer blocks anyone unless explicitly opted in (`BUDGET_ENFORCE=1`). Fixes a surprise "$X spent today" wall users hit after updating (the shipped example config had set a $5 cap that used to be ignored). |
| **v5.33** *(May 2026)* | Laptop-sleep resilience | The health watchdog now detects a resume-from-suspend (large wall-clock jump between checks) and re-arms instead of restarting — no more sleep-induced restart loops or false "crash" alerts on laptops. A genuine hang is still caught on the next check. |
| **v5.32** *(May 2026)* | Maintainability split | Largest source files broken into smaller modules (command groups; the web dashboard's 3,300-line script → core/panels/views) with byte-verified identical behavior. |
| **v5.31** *(May 2026)* | Defense-in-depth | File-write tools refuse credential/persistence/system paths (`.env`, `~/.ssh`, `LaunchAgents`, `/etc`); external MCP servers get a secret-free environment (no bot token / API keys); web, search, and MCP output is fenced as untrusted "data, not instructions" with injection heuristics; documented threat model in `SECURITY.md`. |
| **v5.30** *(May 2026)* | Maintainability + watch mode | `npm run dev:watch` / `build:watch` for source development; the largest command file split into per-area modules (cron, sub-agents) with no behaviour change. |
| **v5.29** *(May 2026)* | Sub-agent visibility | `/subagents list` marks detached agents that will auto-continue when they finish with a ↪️. |
| **v5.27 – v5.28** *(May 2026)* | Security + stability hardening | Web UI requires a password on **every** channel (incl. the live chat socket) when exposed; pattern-based destructive-command guard for the shell tool; plain-HTTP exposure warning; opt-in `MAX_BUDGET_USD` daily cap actually enforced; clear startup messages for common misconfigs; dropped external MCP tool-servers auto-reconnect with backoff. |
| **v5.25 – v5.26** *(May 2026)* | Dashboard + auto-continuation | `/status` rebuilt as a clean, human-readable dashboard with provider-aware limits and reasoning mode; a finished background task can auto-run a promised follow-up, toggled per chat with `/continuation`. |
| **v5.24.7** *(May 2026)* | Thinking-loop self-heal | An extended-thinking API error that could survive a `/restart` (poisoned session anchor) now auto-resets the SDK session on detection — conversation history preserved, no `/new` needed. |
| **v5.24.6** *(May 2026)* | Restart hotfix | v5.24.5's explicit self-relaunch is now a **fallback**: the bot waits a grace window for the OS service manager to respawn it on its own, and only relaunches itself if it doesn't. Fixes the double-spawn (service-manager respawn + explicit relaunch) that briefly ran two instances → Telegram 409 → `/restart` reconnect loop on healthy setups. All self-preservation layers unchanged. |
| **v5.24.5** *(May 2026)* | Self-relaunching restart | Self-restarts now schedule their own relaunch after a clean exit instead of waiting on the OS service manager, whose automatic respawn can stall on long-uptime sessions and leave the bot down after a `/restart` or `/update`. The optional crash-backstop watcher also self-installs on boot if missing (covers setups that only ever updated in place). |
| **v5.24.4** *(May 2026)* | launchd-safe restart | `/restart` + `/update` route through `scheduleGracefulRestart()` for an orderly teardown (Telegram offset commit + long-poll close, port 3100 release, WhatsApp socket close) instead of a bare PM2-era `process.exit(0)` that left lingering resources blocking the launchd respawn on WhatsApp-enabled installs. restart-storm.json test-pollution closed at the source (lazy path + test isolation). |
| **v5.24.3** *(May 2026)* | Quieter logs | Trends-AI debounce (4 h signature cache) — restart-storms no longer fire one redundant AI call per restart with identical to-be-suppressed verdict. Ollama preload silent when the daemon isn't running — public users without Ollama don't see a "[ollama] preload warning" red flag on every boot. |
| **v5.24.2** *(May 2026)* | Files born private | `delivery-queue.json`, `cron-jobs.json`, and `memory/YYYY-MM-DD.md` daily logs now write with `mode: 0o600` explicitly so the startup file-permissions audit no longer needs to repair them on every boot. The recurring `🔒 file-permissions: repaired N` log line stops appearing in normal operation. |
| **v5.24.1** *(May 2026)* | Test-pollution sentinel | Vitest `globalSetup` snapshots production state files before any test file loads, re-checks SHA-256 after the suite — `npm test` now fails non-zero on any future test that writes into the user's real `~/.alvin-bot/`. Closes the regression vector that v5.24.0 mitigated only for the two known polluters. |
| **v5.24** *(May 2026)* | Beast Mode | `lastSdkHistoryIndex` reindexed after every compaction (latent bridge-anchor bug closed). SDK-path compaction enabled — the local history shadow no longer balloons and long SDK sessions feed the memory pipeline. Trend-WARN noise floor raised: malformed AI responses no longer become `(no description)` alarms, suggestions are platform-aware (`journalctl` only on systemd-linux), stale installs get an explicit `/update first` hint. Sanitizer-residue gate widened 0.25 → 0.5 kept-ratio so the canonical 38 %-kept incident shape is actually caught. Bonus: three test files no longer pollute production state. |
| **v5.23** *(May 2026)* | No-more-hangs, never-lose-the-thread | Stalled-output guard for the async-agent watcher: a sub-agent that silently dies mid-run (OOM, parent-restart orphan, re-fork that strands the parent pid) is declared failed after 5 min of no output-file writes — releases `pendingBackgroundCount` and frees the bot from bypass-bridge mode. Pre-bypass compaction: the bridge inject ships a stable narrative summary + last 10 turns instead of every raw turn, so the prompt is small and Anthropic's prompt-cache hits across consecutive bypass-turns (TTFT 5-15 s → sub-second on turn 2+). Sanitizer-residue guard against role-confused history: a leaked `<system-reminder>` block collapsed to a tiny quote-fragment no longer lands as `role: assistant`. |
| **v5.22** *(May 2026)* | Hardcore foundation — dedup + ambient memory + privacy | Sub-agent doppel-spawn-guard (60s window, prompt-hash dedup); `/cancel` + `/stopall` cover detached subprocesses too with PID-command-shape verification; ambient memory search on EVERY turn (not just first SDK turn) — the USP fix; privacy-check.sh now scans git-tracked source in addition to tarball; internal planning docs removed from git tracking. |
| **v5.21** *(May 2026)* | Context-continuity restored | Bypass-bridge no longer caps at 10 turns (was the main USP-breaker mid-conversation); bridge size cap raised 2.5 k → 32 k so a 30-turn detailed session survives intact; background-agent results carry a structural `[BACKGROUND-AGENT RESULT — async side-channel]` marker so parallel agents can't hijack the conversation thread; standing rule in CLAUDE.md to check memory/history before asking the user to repeat. |
| **v5.20** *(May 2026)* | Sub-agent results integrate into history | Telegram reply-quotes on background-agent results now keep context — the result is written to the parent session's history at delivery time (inline / multi-chunk / file tiers all handled), capped at 4 kB per entry. Eliminates the "generic A/B/C answer after reply-quote" pattern. |
| **v5.19** *(May 2026)* | Reliability surface | New `Reliability` tab in the WebUI: restart-storm counter, state-file health, recent incidents, `EXEC_SECURITY` mode, live TTS-provider switch. Subsystem restarts (telegram-poll in-process) DM the owner instead of being silent. State-quarantine findings DM after boot. Incident dumps auto-prune after 30 days. |
| **v5.18** *(May 2026)* | Reliability foundation | Cron auto-fix (silent allowlist + auto-wrap for piped commands) restores the "create job, it runs" feel; `launchctl` / `pkill` / direct-kill-our-pid routed through graceful restart instead of OS SIGKILL; restart-storm-brake refuses the 4th self-restart in 10 min; post-restart "🦊 Wieder da" ("back again") ping after every non-cold-boot; state-file health check at boot quarantines corrupted `sessions.json` / `cron-jobs.json` / `delivery-queue.json`; new `EXEC_SECURITY=full` Power-Mode opt-in in `alvin-bot setup`. |
| **v5.17** *(May 2026)* | Scaffolding-echo hardening + local TTS default | Four independent sanitizer layers (stream / handler / history-write / bridge) block a class of confused-model failure where the underlying LLM emitted `<system-reminder>` scaffolding as if it were a reply; the bot used to DM it verbatim AND re-inject it via the bridge, self-reinforcing. Plus Supertonic local TTS is now the default for fresh installs (lazy auto-install on update), Unicode-safe text path, `/tts <provider>` to switch live. |
| **v5.15 – v5.16** *(May 2026)* | Update + observability hygiene | `/update` survives the `NODE_ENV=production` devDeps trap and stays bilingual; outbound log scrubber now exempts the owner so you can still see your own `.env` contents in DM; cross-platform CI matrix runs on an injected clock so a stable test suite no longer depends on the runner's timezone. |
| **v5.14** *(May 2026)* | Concurrent updates → `/btw` live steering works | Updates are now processed concurrently via the grammy runner (normal messages still serialized per chat), so a `/btw` note — or a mid-task `/stop` — is received and applied *while* a task streams, instead of queuing behind it. Live steering uses the Claude-SDK provider. |
| **v5.13** *(May 2026)* | Prompted-tool sub-agents (opt-in) | `SUBAGENT_PROMPTED_TOOLS=1` lets models without native function-calling run background sub-agents via a strict prompted protocol — capped, honest best-effort degradation, same exec guardrails. Off by default (zero change unless enabled). Completes provider-agnostic background sub-agents. |
| **v5.12** *(May 2026)* | Non-Claude background sub-agents | Function-calling providers (Groq, OpenRouter, NVIDIA NIM, Gemini, capable Ollama) now run background sub-agents via a detached worker reusing the same tool layer + exec guardrails + delivery/reconciliation. Claude-SDK path byte-for-byte unchanged; non-function-calling models keep the graceful fallback. |
| **v5.11** *(May 2026)* | Provider-agnostic sub-agent groundwork | Internal dispatch backend seam; on installs without the `claude` CLI a background-task request now returns an honest message via the normal result path instead of erroring. Claude-SDK path byte-for-byte unchanged. Follow-ups add a generic non-Claude worker. |
| **v5.10** *(May 2026)* | Answer-detail control + real `/btw` | `/verbosity short\|medium\|full` (medium = unchanged default) applied uniformly across all providers; `/btw` is now an actual discoverable command on every messenger — steers a live Claude-SDK task or replies honestly instead of the old silent no-op. |
| **v5.9** *(May 2026)* | Quieter, honest health alerts + fixes | Trend monitor no longer false-alarms after an update (ignores version-change churn, requires a real recurring fault, 24 h debounce); macOS sudo/keychain setup fixed (was hanging with "exit null"); 7-day screenshot auto-cleanup now covers the CLI temp dir + on-demand `/cleanup`. |
| **v5.7 – v5.8** *(May 2026)* | Distribution hardening | Leaner, more tamper-resistant published package; local dev stays fully readable. Verified by fresh global install on a clean machine. |
| **v5.3 – v5.6** *(May 2026)* | Control + delivery UX | Live `btw …` steering of a running task, honest instant ⛔ Stop with confirmation, compact background-task delivery (tight header + answer, long output as one file), 48 h monitor self-heal. |
| **v5.0 – v5.2** *(May 2026)* | Reliable mid-task stop + hardening | Dependable `/cancel` (soft) and `/stopall` (hard) with an inline ⛔ button; whole-project security + bug-hardening pass. |
| **v4.22** *(May 2026)* | Memory architecture overhaul | Pluggable embedding providers — **Gemini · OpenAI · Ollama · FTS5 (zero-config keyword fallback)**. Auto-detection picks the best available, so users with no API key still get a working indexed memory store. Smart inject mode stops bulk-injecting `MEMORY.md` once SQLite is populated. |
| **v4.21** | Agent Browser skill | Tier-1.5 token-efficient web automation via the [agent-browser](https://github.com/vercel-labs/agent-browser) CLI — opt-in by install. ~90 % token reduction vs Playwright on cooperative pages. |
| **v4.20** | SQLite-backed vector memory | Replaces the legacy 128 MB JSON index. Automatic migration on first start, per-chunk INSERT/UPDATE, lazy native binary load with graceful fallback. |
| **v4.18 – v4.19** | Reliability + per-workspace overrides | SDK auto-recovery on token rotation / quota exhaustion / empty streams. Per-workspace `effort` / `provider` / `voice` / `temperature` / `toolset`. |
| **v4.17** | Hardening audit | Disk cleanup service, hardening fixes from internal audit. |
| **v4.13 – v4.14** | Detached sub-agents | `alvin_dispatch_agent` MCP tool spawns independent `claude -p` subprocesses that survive parent aborts. Multi-platform dispatch (Slack / Discord / WhatsApp). Watcher zombie guard. |
| **v4.10 – v4.12** | Multi-session + Slack | Workspace registry with hot-reload, per-channel personas + cwd, Slack adapter with progress ticker + typing status, owner approval gate, async sub-agents. |
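
Several rows above describe small guard mechanisms. As one example, the v5.18 restart-storm brake ("refuses the 4th self-restart in 10 min") behaves like a sliding-window counter. The sketch below is an illustration under assumptions: `allowRestart` is a hypothetical name, and the real brake persists its state (the table mentions `restart-storm.json`) instead of passing an array around.

```typescript
// Hypothetical sketch of a sliding-window restart brake.
const STORM_WINDOW_MS = 10 * 60 * 1000; // 10-minute window
const MAX_RESTARTS_IN_WINDOW = 3;       // the 4th restart is refused

interface BrakeResult {
  allowed: boolean;
  history: number[]; // restart timestamps still inside the window
}

function allowRestart(history: number[], now: number): BrakeResult {
  // Forget restarts that have aged out of the window.
  const recent = history.filter((t) => now - t < STORM_WINDOW_MS);
  if (recent.length >= MAX_RESTARTS_IN_WINDOW) {
    return { allowed: false, history: recent };
  }
  return { allowed: true, history: recent.concat([now]) };
}
```

Returning the pruned history alongside the decision lets the caller store it back without a second pass.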

### 🏛️ Foundations (built before v4.10)

Multi-model provider abstraction with fallback chains · plugin & skill ecosystems with hot-reload · multi-platform adapters (Telegram, WhatsApp, Discord, Signal, Slack) · Web UI with i18n + command palette · native macOS `.dmg` via Electron · Docker Compose · npm distribution · MCP client + custom tools · universal tool use across providers · full media pipeline (audio · video · photo · voice).

### 🎯 On the radar

| Priority | Item | Why |
|---|---|---|
| **P0** | MCP plugin sandboxing | MCP servers currently run with full Node privileges. Plan: child process with restricted FS + network policy (deno-permission style). Architectural change. |
| **P1** | Electron major upgrade (35 → 41+) + Windows `.exe` | Closes desktop-build CVEs, unblocks the only platform still missing a native installer. |
| **P1** | Prompt injection defense policy | Needs a design decision (heuristic filter / allow-list / accept-the-risk with clearer warnings) and consistent enforcement at every message entry point. |
| **P2** | Per-workspace memory layer | Facts learned in one workspace stay scoped unless explicitly promoted. Builds on the v4.22 SQLite store. |
| **P2** | Per-workspace skill allowlist | Scope Apple Notes to personal workspace, sysadmin tools to devops only, etc. |
| **P2** | Multi-user Slack (`per-channel-peer`) | Different users in the same Slack channel get their own sub-sessions. |
| **P2** | Live steering (`/btw`) for file, photo & document tasks | Attachments run on a handler path that never opens a steer channel, so `/btw` can't nudge them mid-task yet (it replies honestly instead). Plan: extend the streaming steer setup to the attachment handlers. |
| **P3** | Linux `.AppImage` / `.deb`, Homebrew formula, Scoop manifest, one-line install script | Platform reach for non-npm users. |
| **P3** | Daily-log decay / archive | Older daily logs move to cold storage after N days. |
| **P3** | Workspace cloning / templates | `/workspace clone my-project as my-fork` spins up a new workspace from an existing one. |
| **P3** | TypeScript 5 → 6 | 5.x still supported; strict-mode break-fix work, not urgent. |

Pull requests welcome — see [`CONTRIBUTING.md`](CONTRIBUTING.md).

---

## 🔒 Security

> ### ⚠️ Important: Alvin has full shell + filesystem access
>
> Alvin Bot is an **autonomous AI agent** built on the Claude Agent SDK with shell, filesystem, and network access to the machine it runs on. This is by design — it's the point of the project. But it means:
>
> - **Treat the bot like `sudo` access** — only install it on machines where you'd trust Claude Code to run without supervision.
> - **Never expose the Web UI (port 3100) to the internet** without HTTPS, rate limiting, and a strong `WEB_PASSWORD`. It binds to `localhost` by default.
> - **On multi-user systems**, verify `~/.alvin-bot/.env` is chmod `600` (v4.12.2+ enforces this automatically on startup).
> - **`ALLOWED_USERS` is your first line of defense** — v4.12.2+ refuses to start if it's empty and Telegram is enabled.
>
> **Read the full threat model and hardening guide:** [`docs/security.md`](docs/security.md)

### Access control

- **User whitelist** — Only `ALLOWED_USERS` can interact with the bot (hard-enforced at startup since v4.12.2)
- **WhatsApp group approval** — Per-group participant whitelist + owner approval gate via Telegram (with WhatsApp DM / Discord / Signal fallback). Group members never see the approval process.
- **Slack allowlist** — `SLACK_ALLOWED_USERS` restricts who can DM or @mention the bot in Slack
- **DM pairing** — Optional 6-digit code flow for new users via owner approval (`AUTH_MODE=pairing`)

### Execution hardening

- **`EXEC_SECURITY=allowlist`** (default) — Shell commands must match a whitelist of safe binaries and **cannot contain shell metacharacters** (`;`, `|`, `&`, `` ` ``, `$(...)`, redirects). Commands that do are rejected by the exec-guard metachar filter (v4.12.2+).
- **Cron shell jobs** go through the same exec-guard (v4.12.2+) — cron is no longer a bypass vector.
- **Sub-agent toolset presets** — spawn sub-agents with `toolset: "readonly"` or `"research"` to restrict what they can do, regardless of the parent's privileges.
- **Timing-safe webhook auth** — `POST /api/webhook` uses `crypto.timingSafeEqual` (v4.12.2+) to prevent timing side-channel token extraction.
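
A minimal sketch of the allowlist idea, assuming a simple two-step check: reject any command containing a listed metacharacter, then require the first word to be an allowlisted binary. The `SAFE_BINARIES` list and the `isCommandAllowed` name are hypothetical. The bot's real allowlist and filter are more thorough than this.

```typescript
// Hypothetical sketch of an allowlist exec-guard; the binary list is illustrative.
const SAFE_BINARIES = ["ls", "cat", "git", "df"];

// Metacharacters named in the list above: ; | & ` $( and redirects (< >).
const SHELL_METACHARS = /[;|&`<>]|\$\(/;

function isCommandAllowed(command: string): boolean {
  const trimmed = command.trim();
  if (trimmed === "") return false;
  if (SHELL_METACHARS.test(trimmed)) return false; // no chaining, piping, or substitution
  const binary = trimmed.split(/\s+/)[0];
  return SAFE_BINARIES.indexOf(binary) !== -1;     // first word must be allowlisted
}
```

Running the metacharacter check first matters: an allowlisted binary alone would still let `ls; rm -rf /` through.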

### Data hardening

- **Self-hosted** — Your data stays on your machine. No cloud sync, no external logging of prompts or responses.
- **No telemetry** — Zero tracking, zero analytics, zero phone-home.
- **File permissions** — `.env`, `sessions.json`, memory logs, cron jobs, and all sensitive state files are chmod `0o600` on every write and repaired at startup (v4.12.2+).
- **Owner protection** — Owner account cannot be deleted via UI.
- **Encrypted sudo credentials** — If you enable sudo exec, passwords are stored encrypted with an XOR key in a separate file, both chmod `0o600`.

### Known limitations (documented honestly)

- **Prompt injection** cannot be reliably filtered — we document this as a capability tradeoff rather than pretending to solve it. See `docs/security.md` for the full discussion.
- **Not yet hardened for public-internet deployment** — current scope is "on your own machine". VPS deployment works but requires additional reverse-proxy + TLS + rate-limit setup that we don't automate.
- **Electron Desktop build** has known CVEs (Phase 18 roadmap). The primary distribution is npm global install, not Desktop — if you don't use the Desktop wrapper, you're not affected.

---

## 📄 License

MIT — See [LICENSE](LICENSE).

---

## 🤝 Contributing

Issues and PRs welcome! Please follow the existing code style when contributing.

```bash
git clone https://github.com/alvbln/Alvin-Bot.git
cd Alvin-Bot
npm install
npm run dev    # Development with hot reload
```
