# Cortex

A high-performance, locally hosted AI memory system that stores, indexes, and retrieves contextual knowledge over MCP (Model Context Protocol). It is designed for Claude Code but works with any MCP client.

## Installation

```bash
npm install -g @londer/cortex
```

Or from source:

```bash
git clone https://github.com/londer/cortex.git
cd cortex
npm install
npm run build
```

## Quick Start

```bash
# 1. Set up storage backends (Qdrant + Neo4j via Docker)
cortex setup

# 2. Start the MCP server
cortex serve
```

## Claude Code Integration

```bash
# Add Cortex to Claude Code
claude mcp add cortex -- cortex serve

# Generate Claude Code instructions for your project
cortex init

# Or set up global instructions (applies to all projects)
cortex init --global
```

Or add it manually to your MCP config (for Claude Code, a project-level `.mcp.json`):

```json
{
  "mcpServers": {
    "cortex": {
      "command": "cortex",
      "args": ["serve"]
    }
  }
}
```

## LLM Setup (Optional)

Cortex works fully without an API key. Adding an Anthropic API key unlocks higher-quality (Tier 3) extraction and smart consolidation.

```bash
# Via CLI
cortex config set anthropic_api_key sk-ant-your-key-here

# Or via environment variable
export ANTHROPIC_API_KEY=sk-ant-your-key-here
```

### Ollama (Local LLM)

For privacy-first LLM extraction without an API key, install [Ollama](https://ollama.ai):

```bash
# Check Ollama status
cortex ollama status

# Pull the default model
cortex ollama pull

# Or configure a different model
cortex config set ollama_model mistral
```

## Extraction Tiers

Cortex uses a tiered entity extraction system:

| Tier | Method | Quality | Latency | Cost | Requirements |
|------|--------|---------|---------|------|-------------|
| 1 | Regex + Heuristics | Basic | < 1ms | Free | Always available |
| 2 | NLP (compromise.js) | Good | < 50ms | Free | Always available |
| 2.5 | Ollama (local LLM) | Good+ | 1-5s | Free | Ollama running locally |
| 3 | LLM (Claude API) | Best | 500ms-2s | ~$0.002/call | API key required |

Cortex automatically selects the highest available tier, trying them in the order 3 → 2.5 → 2 → 1, and falls back to the next tier down when one is unavailable.
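
The fallback order can be pictured with a minimal sketch. The `Extractor` interface, `extractWithFallback`, and the sample `regexTier` are hypothetical names used for illustration only; Cortex's actual internals may be structured differently.

```typescript
// Illustrative sketch only: these types and names are assumptions, not Cortex's real API.
type Tier = 3 | 2.5 | 2 | 1;

interface Extractor {
  tier: Tier;
  available: () => boolean;            // e.g. "is an API key set?" or "is Ollama reachable?"
  extract: (text: string) => string[]; // returns entity names
}

// Try extractors from the highest tier to the lowest. Skip unavailable ones,
// and move on to the next tier if an extractor throws.
function extractWithFallback(
  text: string,
  extractors: Extractor[],
): { tier: Tier; entities: string[] } {
  const ordered = [...extractors].sort((a, b) => b.tier - a.tier);
  for (const extractor of ordered) {
    if (!extractor.available()) continue;
    try {
      return { tier: extractor.tier, entities: extractor.extract(text) };
    } catch {
      // This tier failed at runtime; fall through to the next one.
    }
  }
  throw new Error("no extraction tier available");
}

// A toy Tier 1 extractor: treats capitalized words as entities.
const regexTier: Extractor = {
  tier: 1,
  available: () => true,
  extract: (text) => text.match(/\b[A-Z][a-zA-Z]+\b/g) ?? [],
};
```

Because Tiers 1 and 2 are always available, a chain like this should always produce a result as long as a local extractor is registered.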

## Capability Matrix

| Feature | No API Key | With Ollama | With API Key |
|---------|-----------|-------------|-------------|
| Entity extraction | Tier 1-2 (regex + NLP) | Tier 2.5 (local LLM) | Tier 3 (Claude) |
| Relationship detection | Basic verb patterns | LLM-powered | Full semantic understanding |
| Auto-extraction | Local heuristics | Ollama-powered | LLM-powered analysis |
| Consolidation: near-identical dedup | Yes (> 0.95 similarity) | Yes | Yes |
| Consolidation: smart merge | Flagged for review | Yes | Yes |
| Consolidation: contradiction resolution | No | Yes | Yes |
| memory_ingest | Tier 1-2 extraction | Tier 2.5 extraction | Tier 3 extraction |
| Cross-project sharing | Full access | Full access | Full access |
| Access tracking & stats | Full access | Full access | Full access |
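
As a rough illustration of the near-identical dedup row, the sketch below flags pairs of memories whose embeddings exceed the 0.95 threshold. It assumes the similarity measure is cosine similarity over embedding vectors, which this README does not state; `EmbeddedMemory` and `findNearDuplicates` are hypothetical names.

```typescript
// Illustrative sketch only: cosine similarity is an assumption about the metric.
interface EmbeddedMemory {
  id: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return every pair of memory IDs whose similarity is strictly above the threshold.
// This is O(n^2) and only meant to show the idea; a vector database would
// normally answer this with a nearest-neighbor query instead.
function findNearDuplicates(
  memories: EmbeddedMemory[],
  threshold = 0.95,
): Array<[string, string]> {
  const pairs: Array<[string, string]> = [];
  for (let i = 0; i < memories.length; i++) {
    for (let j = i + 1; j < memories.length; j++) {
      const similarity = cosineSimilarity(memories[i].embedding, memories[j].embedding);
      if (similarity > threshold) pairs.push([memories[i].id, memories[j].id]);
    }
  }
  return pairs;
}
```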

## Available Tools

| Tool | Description |
|------|-------------|
| `memory_store` | Store a memory. Auto-embeds content, stores metadata, links entities. |
| `memory_search` | Semantic search with scope boosting. |
| `memory_relate` | Create entity relationships in the knowledge graph. |
| `memory_graph_query` | Traverse the knowledge graph from an entity. |
| `memory_context` | Smart context retrieval combining semantic + graph + recency. |
| `memory_forget` | Delete a memory from all stores (requires confirmation). |
| `memory_ingest` | Ingest raw text and extract memories/entities/relationships. |
| `memory_consolidate` | Merge redundant memories. Supports dry_run preview. |
| `memory_config` | View/modify runtime configuration (including API key). |
| `memory_stats` | Aggregate statistics: totals, breakdowns, access patterns, staleness. |
| `memory_share` | Cross-project sharing: promote visibility, link/unlink projects. |
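
MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `memory_store`; the `content` argument is a hypothetical field name, since the tool's input schema is not documented here, and `buildToolCall` is an illustrative helper rather than part of Cortex.

```typescript
// Illustrative sketch only: the argument names passed to the tool are assumptions.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build an MCP tools/call request with a fresh, incrementing request ID.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Hypothetical payload: "content" is a guess at the field name.
const request = buildToolCall("memory_store", {
  content: "The staging database runs on port 5433",
});

// Over the stdio transport, each message is sent as a single line of JSON.
console.log(JSON.stringify(request));
```

In practice an MCP client such as Claude Code handles the initialization handshake and message framing, so requests like this rarely need to be built by hand.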

## CLI Reference

```
cortex serve                      Start the MCP server (stdio transport)
cortex setup                      Start production Qdrant + Neo4j (ports 16333/17687)
cortex setup --dev                Start dev Qdrant + Neo4j (ports 26333/27687)
cortex setup --stop               Stop containers
cortex init                       Generate Cortex instructions for project CLAUDE.md
cortex init --global              Generate global instructions (~/.claude/CLAUDE.md)
cortex config get                 Show current configuration
cortex config set <key> <value>   Set a runtime config value
cortex consolidate                Run manual consolidation
cortex stats                      Show memory statistics and staleness info
cortex ollama status              Check Ollama availability and installed models
cortex ollama pull [model]        Pull an Ollama model
cortex version                    Show version
cortex help                       Show help
```

## Configuration

Configuration is loaded from environment variables with sensible defaults. See `.env.example` for all options.

Key settings:

| Variable | Default | Description |
|----------|---------|-------------|
| `QDRANT_URL` | `http://localhost:16333` | Qdrant server URL |
| `NEO4J_URI` | `bolt://localhost:17687` | Neo4j bolt URI |
| `SQLITE_PATH` | `~/.cortex/cortex.db` | SQLite database path |
| `ANTHROPIC_API_KEY` | *(empty)* | Anthropic API key (optional) |
| `CORTEX_EXTRACTION_TIER` | `auto` | `auto`, `local-only`, or `llm-preferred` |
| `CORTEX_AUTO_EXTRACT` | `true` | Enable auto-extraction from conversations |
| `CORTEX_CONSOLIDATION_ENABLED` | `true` | Enable periodic memory consolidation |
| `CORTEX_STALENESS_THRESHOLD_DAYS` | `90` | Days before a memory is considered stale |
| `CORTEX_OLLAMA_ENABLED` | `true` | Enable Ollama local LLM |
| `CORTEX_OLLAMA_URL` | `http://localhost:11434` | Ollama server URL |
| `CORTEX_OLLAMA_MODEL` | `llama3.2` | Ollama model name |

Runtime overrides, set with `cortex config set` or the `memory_config` tool, persist to `~/.cortex/runtime-config.json`.
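
The layering of defaults, environment variables, and runtime overrides might look like the sketch below. It covers only a subset of the variables above, and it assumes runtime overrides take precedence over environment variables; `CortexConfig` and `resolveConfig` are hypothetical names, not Cortex's actual loader.

```typescript
// Illustrative sketch only: the precedence order (overrides > env > defaults) is an assumption.
const TIERS = ["auto", "local-only", "llm-preferred"] as const;
type ExtractionTier = (typeof TIERS)[number];

interface CortexConfig {
  qdrantUrl: string;
  neo4jUri: string;
  extractionTier: ExtractionTier;
  stalenessThresholdDays: number;
  ollamaEnabled: boolean;
}

type Env = Record<string, string | undefined>;

// Defaults copied from the table above.
const DEFAULTS: CortexConfig = {
  qdrantUrl: "http://localhost:16333",
  neo4jUri: "bolt://localhost:17687",
  extractionTier: "auto",
  stalenessThresholdDays: 90,
  ollamaEnabled: true,
};

// Unknown tier names fall back to the default instead of failing.
function parseTier(raw: string | undefined): ExtractionTier {
  return TIERS.find((tier) => tier === raw) ?? DEFAULTS.extractionTier;
}

// Accept only positive, finite numbers.
function parseDays(raw: string | undefined): number {
  const days = Number(raw);
  const valid = raw !== undefined && raw.trim() !== "" && Number.isFinite(days) && days > 0;
  return valid ? days : DEFAULTS.stalenessThresholdDays;
}

function parseBool(raw: string | undefined, fallback: boolean): boolean {
  if (raw === "true") return true;
  if (raw === "false") return false;
  return fallback;
}

// Merge order: defaults, then environment variables, then runtime overrides.
function resolveConfig(env: Env, overrides: Partial<CortexConfig> = {}): CortexConfig {
  const fromEnv: CortexConfig = {
    qdrantUrl: env["QDRANT_URL"] ?? DEFAULTS.qdrantUrl,
    neo4jUri: env["NEO4J_URI"] ?? DEFAULTS.neo4jUri,
    extractionTier: parseTier(env["CORTEX_EXTRACTION_TIER"]),
    stalenessThresholdDays: parseDays(env["CORTEX_STALENESS_THRESHOLD_DAYS"]),
    ollamaEnabled: parseBool(env["CORTEX_OLLAMA_ENABLED"], DEFAULTS.ollamaEnabled),
  };
  return { ...fromEnv, ...overrides };
}
```

A real loader would pass `process.env` as `env` and the parsed contents of the runtime config file as `overrides`.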

## Development

```bash
# Run with hot reload
npm run dev

# Run tests (requires Docker backends running)
npm test

# Build
npm run build

# Release (bump version, update changelog, create tag)
npm run release
```

## Architecture

```
Claude Code
    │ MCP (stdio)
    ▼
┌──────────────────────────────────────┐
│          Cortex MCP Server           │
│            (TypeScript)              │
├──────────────────────────────────────┤
│  Extraction    │ Consolidation       │
│  Tier 1-3+     │ LLM / Ollama / Local│
│  Auto-extract  │ Cluster + Merge     │
├──────────────────────────────────────┤
│         Orchestration Layer          │
│  Scope inference, ranking, dedup     │
├─────────┬──────────┬─────────────────┤
│ Qdrant  │  Neo4j   │     SQLite      │
│ vectors │  graph   │    metadata     │
└─────────┴──────────┴─────────────────┘
         ▲                    ▲
  HuggingFace            Anthropic API
  Transformers.js         (optional)
                          Ollama
                          (optional)
```

## License

MIT - Arthur Lonfils / Londer