---
name: jcodemunch
short_name: jcode
description: JCodeMunch - Token-efficient code retrieval via structured symbol indexing
category: development
priority: P0
tags: [jcodemunch, token-efficiency, code-retrieval, ast-parsing, structured-search]
---

# JCodeMunch Skill

**JCodeMunch** (`jcodemunch-mcp`) is an MCP server for token-efficient exploration of GitHub repositories and local folders. It indexes a codebase once using tree-sitter AST parsing, then serves precise symbol-level retrieval.

## Core Philosophy

> **"Cut code-reading token costs by 95% or more"**

Traditional AI agents explore repositories the expensive way:
- Open entire files → skim thousands of irrelevant lines → repeat
- **That is not "a little inefficient." That is a token incinerator.**

**JCodeMunch indexes a codebase once and lets agents retrieve only the exact code they need:**

| Task | Traditional approach | With JCodeMunch |
| --- | --- | --- |
| Find a function | ~40,000 tokens | ~200 tokens |
| Understand module API | ~15,000 tokens | ~800 tokens |
| Explore repo structure | ~200,000 tokens | ~2,000 tokens |

**Index once. Query cheaply forever. Precision context beats brute-force context.**

---

## When to Use JCodeMunch

**Good for:**
- ✅ Large multi-module repositories
- ✅ Unfamiliar codebases
- ✅ Agent-driven code exploration
- ✅ Refactoring and impact analysis
- ✅ Onboarding new developers
- ✅ Teams trying to cut AI token costs
- ✅ Developers tired of paying premium rates for glorified file scrolling

**Not good for:**
- ❌ LSP diagnostics or completions (use your LSP directly)
- ❌ Editing workflows (use your editor)
- ❌ Real-time file watching
- ❌ Semantic program analysis
- ❌ Cross-repository global indexing

---

## Available Tools

### 1. index_repo

**Purpose:** Index a GitHub repository for symbol-level retrieval

**Usage:**
```
index_repo: { "url": "owner/repo" }
```

**What happens:**
- Clones repository locally
- Parses all source files with tree-sitter
- Extracts symbols with metadata: signature, kind, qualified name, one-line summary
- Stores byte offsets into original file for O(1) seeking
- Creates structured index in `~/.code-index/`

**Example:**
```
index_repo: { "url": "geekcomputers/Python" }
```
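
The byte-offset idea from the list above can be illustrated with a small sketch. This is not JCodeMunch's actual implementation: `SymbolEntry` and `read_symbol_source` are hypothetical names, and the only assumption is that the index stores a `start_byte` and `end_byte` per symbol (the field names used in the `search_symbols` example).

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SymbolEntry:
    """Hypothetical index record; not the real on-disk index format."""
    qualified_name: str
    file_path: str
    start_byte: int
    end_byte: int


def read_symbol_source(entry: SymbolEntry) -> str:
    """Read only the bytes belonging to one symbol, without scanning the file."""
    with open(entry.file_path, "rb") as handle:
        handle.seek(entry.start_byte)  # jump straight to the symbol's first byte
        raw = handle.read(entry.end_byte - entry.start_byte)
    return raw.decode("utf-8")


# Demo: write a small file, record offsets for one function, read it back.
source = (
    "def helper():\n"
    "    return 1\n"
    "\n"
    "def login(user):\n"
    "    return user\n"
)
target = "def login(user):\n    return user\n"
data = source.encode("utf-8")
start = data.index(target.encode("utf-8"))

demo_path = Path(tempfile.mkdtemp()) / "auth.py"
demo_path.write_bytes(data)

entry = SymbolEntry("login", str(demo_path), start, start + len(target.encode("utf-8")))
snippet = read_symbol_source(entry)
print(snippet)
```

Because the read starts at a stored offset, its cost depends on the size of the symbol rather than the size of the file.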

---

### 2. index_folder

**Purpose:** Index a local folder for symbol-level retrieval

**Usage:**
```
index_folder: { "path": "/path/to/project" }
```

**What happens:**
- Walks folder structure
- Parses all source files with tree-sitter
- Extracts symbols with metadata
- Creates structured index stored locally

**Example:**
```
index_folder: { "path": "/home/mdwzrd/wzrd.dev" }
```

---

### 3. get_file_outline

**Purpose:** Get symbol hierarchy for a file (understand structure before pulling source)

**Usage:**
```
get_file_outline: { "repo": "owner/repo", "file_path": "src/main.py" }
```

**Returns:**
- All symbols in file
- Hierarchical structure (classes → methods, imports → used symbols)
- Symbol metadata (kind, name, signature)

**Example:**
```
get_file_outline: { "repo": "owner/repo", "file_path": "src/auth.py" }

# Returns structure like:
classes:
  - name: UserService
    kind: class
    methods:
      - name: login
        signature: login(username, password) -> token
      - name: logout
        signature: logout() -> None
```

---

### 4. get_file_content

**Purpose:** Retrieve specific lines from a file (when you need context around a symbol)

**Usage:**
```
get_file_content: {
  "repo": "owner/repo",
  "file_path": "src/main.py",
  "start_line": 10,
  "end_line": 25
}
```

**Returns:**
- Exact line range requested
- Only the content of those lines, not the rest of the file
- Preserves indentation and formatting

**When to use:**
- After calling `get_file_outline` to understand structure
- To see implementation details for specific symbols
- When you need context around a method/function

---

### 5. get_symbol

**Purpose:** Retrieve full source for a specific symbol

**Usage:**
```
get_symbol: { "repo": "owner/repo", "symbol_id": "src/main.py::UserService.login#method" }
```

**Returns:**
- Full implementation of the symbol
- With surrounding context (imports, dependencies)
- Located by an O(1) byte-offset seek to the symbol's start in the source file

**Symbol ID format:**
```
{file_path}::{qualified_name}#{kind}
```

Kinds: `function`, `method`, `class`, `variable`, `constant`, `type`, `import`
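
As a rough illustration of the ID format above, the following sketch splits a symbol ID into its three parts and rebuilds it. The helper names `parse_symbol_id` and `format_symbol_id` are hypothetical and not part of the server's API; the sketch also assumes the file path itself never contains `::`.

```python
from typing import NamedTuple


class SymbolId(NamedTuple):
    file_path: str
    qualified_name: str
    kind: str


def parse_symbol_id(symbol_id: str) -> SymbolId:
    """Split '{file_path}::{qualified_name}#{kind}' into its parts."""
    # The kind is everything after the last '#'.
    rest, hash_sep, kind = symbol_id.rpartition("#")
    # The file path is everything before the first '::'.
    file_path, colon_sep, qualified_name = rest.partition("::")
    if not (hash_sep and colon_sep and file_path and qualified_name and kind):
        raise ValueError(f"malformed symbol ID: {symbol_id!r}")
    return SymbolId(file_path, qualified_name, kind)


def format_symbol_id(file_path: str, qualified_name: str, kind: str) -> str:
    """Inverse of parse_symbol_id."""
    return f"{file_path}::{qualified_name}#{kind}"


parsed = parse_symbol_id("src/main.py::UserService.login#method")
print(parsed)
```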

**Example:**
```
get_symbol: { "repo": "owner/repo", "symbol_id": "src/auth.py::AuthenticationService.verify_token#method" }
```

---

### 6. search_symbols

**Purpose:** Search for symbols by name, kind, or language

**Usage:**
```
search_symbols: { "repo": "owner/repo", "query": "authenticate" }
```

**Parameters:**
- `repo`: Repository identifier
- `query`: Symbol name to search for
- Optional filters: `kind`, `language`, `case_sensitive`

**Returns:**
- All matching symbols
- Their locations and signatures
- One-line summaries

**Example:**
```
search_symbols: { "repo": "owner/repo", "query": "login" }

# Returns matching symbols (illustrative response shape):
[
  {
    symbol_id: "src/auth.py::UserService.login#method",
    kind: "method",
    qualified_name: "UserService.login",
    file_path: "src/auth.py",
    start_byte: 1420,
    end_byte: 1567
  },
  {
    symbol_id: "src/auth.py::TokenManager.validate#method",
    kind: "method",
    qualified_name: "TokenManager.validate",
    file_path: "src/auth.py",
    start_byte: 3200,
    end_byte: 3450
  }
]
```
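
To make the `query`, `kind`, and `case_sensitive` parameters concrete, here is a minimal sketch of filtering over an in-memory list of symbol records. It assumes plain substring matching on the qualified name; the real server may match and rank differently, and `filter_symbols` is a hypothetical name.

```python
def filter_symbols(index, query, kind=None, case_sensitive=False):
    """Return entries whose qualified name contains the query (assumed substring match)."""
    needle = query if case_sensitive else query.lower()
    matches = []
    for entry in index:
        name = entry["qualified_name"]
        haystack = name if case_sensitive else name.lower()
        if needle not in haystack:
            continue
        if kind is not None and entry["kind"] != kind:
            continue
        matches.append(entry)
    return matches


# Illustrative records only; a real index carries more metadata per symbol.
index = [
    {"qualified_name": "UserService.login", "kind": "method"},
    {"qualified_name": "login_required", "kind": "function"},
    {"qualified_name": "TokenManager.validate", "kind": "method"},
    {"qualified_name": "LoginForm", "kind": "class"},
]

all_hits = filter_symbols(index, "login")
method_hits = filter_symbols(index, "login", kind="method")
exact_case_hits = filter_symbols(index, "login", case_sensitive=True)
print(len(all_hits), len(method_hits), len(exact_case_hits))
```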

---

### 7. search_text

**Purpose:** Full-text search with configurable context (when structure alone isn't enough)

**Usage:**
```
search_text: { "repo": "owner/repo", "query": "TODO", "context_lines": 1 }
```

**Parameters:**
- `repo`: Repository identifier
- `query`: Text to search for
- `context_lines`: Number of lines before/after match (default: 1)

**When to use:**
- Symbol search doesn't find what you need
- Searching for comments, docstrings, or unstructured code
- When implementation details are more important than structure

**Warning:** This retrieves full file content, so it's more expensive than symbol search. Use `search_symbols` first.
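
The effect of `context_lines` can be sketched as follows. This is an assumption about the behavior (plain substring match, context clipped at file boundaries), not the server's actual code, and `find_with_context` is a hypothetical name.

```python
def find_with_context(text, query, context_lines=1):
    """Return each matching line with `context_lines` lines before and after it."""
    lines = text.splitlines()
    results = []
    for i, line in enumerate(lines):
        if query in line:
            lo = max(0, i - context_lines)               # clip at start of file
            hi = min(len(lines), i + context_lines + 1)  # clip at end of file
            results.append({"line": i + 1, "context": lines[lo:hi]})
    return results


sample = (
    "import os\n"
    "# TODO: validate input\n"
    "def run():\n"
    "    pass\n"
    "# TODO: add logging\n"
)
hits = find_with_context(sample, "TODO", context_lines=1)
for hit in hits:
    print(hit["line"], hit["context"])
```

Each extra context line is returned once per match, so a larger `context_lines` value multiplies the response size on common queries.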

---

### 8. get_repo_outline

**Purpose:** High-level repository overview (files, modules, dependencies)

**Usage:**
```
get_repo_outline: { "repo": "owner/repo" }
```

**Returns:**
- File tree structure
- Module organization
- Import/dependency relationships
- Key entry points

**When to use:**
- First time exploring a codebase
- Understanding overall architecture
- Finding main application files

---

### 9. list_repos

**Purpose:** List all indexed repositories

**Usage:**
```
list_repos: {}
```

**Returns:**
- All indexed repositories
- Their metadata (indexed_at, file count, symbol count)

**When to use:**
- See what's available for search
- Check indexing status
- Decide whether a repository needs re-indexing

---

### 10. get_symbols

**Purpose:** Batch retrieve multiple symbols (when you need context for several related symbols)

**Usage:**
```
get_symbols: {
  "repo": "owner/repo",
  "symbol_ids": [
    "src/main.py::UserService.login#method",
    "src/main.py::TokenManager.validate#method"
  ]
}
```

**Returns:**
- Full implementation for all requested symbols
- With dependencies and context

**When to use:**
- Understanding a complete API surface
- Getting multiple related methods at once
- More efficient than multiple individual `get_symbol` calls

---

### 11. invalidate_cache

**Purpose:** Remove cached index and force re-index

**Usage:**
```
invalidate_cache: { "repo": "owner/repo" }
```

**When to use:**
- Codebase has changed significantly
- Previous index is outdated
- Need fresh symbol data

---

### 12. get_file_tree

**Purpose:** Get complete file tree structure for repository

**Usage:**
```
get_file_tree: { "repo": "owner/repo", "path": "src" }
```

**Returns:**
- Complete directory tree
- File sizes and types
- Last modified times

**When to use:**
- Understanding project layout
- Finding where specific modules are located
- Identifying large files that might need attention

---

## JCodeMunch Response Format

Every tool response includes a `_meta` envelope with timing, token savings, and cost avoided:

```json
{
  "_meta": {
    "timing_ms": 4.3,
    "tokens_saved": 4853,
    "total_tokens_saved": 128087,
    "cost_avoided": {
      "claude_opus": 1.20,
      "gpt5_latest": 0.48
    },
    "total_cost_avoided": {
      "claude_opus": 32.02,
      "gpt5_latest": 12.81
    }
  }
}
```

The `total_*` fields accumulate across all tool calls and persist to `~/.code-index/_savings.json`.
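
A client could surface these figures with something like the sketch below. It reads only the fields shown in the sample envelope above; `summarize_meta` is a hypothetical helper, and the handling of a missing `_meta` key is an assumption made for robustness.

```python
def summarize_meta(response: dict) -> dict:
    """Pull the savings figures out of a tool response, tolerating a missing envelope."""
    meta = response.get("_meta", {})
    total_cost = meta.get("total_cost_avoided", {})
    return {
        "tokens_saved": meta.get("tokens_saved", 0),
        "total_tokens_saved": meta.get("total_tokens_saved", 0),
        # Model whose pricing yields the largest avoided cost so far, if any.
        "top_model": max(total_cost, key=total_cost.get) if total_cost else None,
    }


# Same values as the sample envelope in this section.
response = {
    "_meta": {
        "timing_ms": 4.3,
        "tokens_saved": 4853,
        "total_tokens_saved": 128087,
        "cost_avoided": {"claude_opus": 1.20, "gpt5_latest": 0.48},
        "total_cost_avoided": {"claude_opus": 32.02, "gpt5_latest": 12.81},
    }
}
summary = summarize_meta(response)
print(summary)
```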

---

## Workflow Examples

### Example 1: Find a function

**Task:** Locate the authentication function in the codebase

```
# Step 1: Search for symbols
search_symbols: { "repo": "owner/repo", "query": "authenticate" }

# Step 2: Get the specific symbol
get_symbol: { "repo": "owner/repo", "symbol_id": "src/auth.py::AuthenticationService.verify#method" }
```

**Result:** ~250 tokens total vs ~7,500 for the traditional approach
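
The two steps above could be chained in client code roughly as follows. This is a sketch under stated assumptions: `call_tool` stands in for whatever function your MCP client uses to invoke a tool, `search_symbols` is assumed to return a list of records with a `symbol_id` field (as in the example response earlier), and the `get_symbol` response shape in the stub is invented for the demo.

```python
from typing import Any, Callable, Optional

ToolCaller = Callable[[str, dict], Any]


def find_symbol_source(call_tool: ToolCaller, repo: str, query: str) -> Optional[dict]:
    """Search first, then fetch only the first match instead of reading whole files."""
    matches = call_tool("search_symbols", {"repo": repo, "query": query})
    if not matches:
        return None
    symbol_id = matches[0]["symbol_id"]
    return call_tool("get_symbol", {"repo": repo, "symbol_id": symbol_id})


# Stub standing in for a real MCP client, so the sketch runs on its own.
calls = []


def fake_call_tool(name: str, args: dict) -> Any:
    calls.append(name)
    if name == "search_symbols":
        return [{"symbol_id": "src/auth.py::AuthenticationService.verify#method"}]
    if name == "get_symbol":
        # Hypothetical response shape, for demonstration only.
        return {"symbol_id": args["symbol_id"], "source": "def verify(self): ..."}
    raise ValueError(f"unexpected tool: {name}")


result = find_symbol_source(fake_call_tool, "owner/repo", "authenticate")
print(calls, result["symbol_id"])
```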

---

### Example 2: Understand a module

**Task:** Understand the UserService API

```
# Step 1: Get file outline
get_file_outline: { "repo": "owner/repo", "file_path": "src/users.py" }

# Step 2: Get specific methods
get_symbols: {
  "repo": "owner/repo",
  "symbol_ids": [
    "src/users.py::UserService.create#method",
    "src/users.py::UserService.update#method",
    "src/users.py::UserService.delete#method"
  ]
}
```

**Result:** ~1,200 tokens vs ~15,000 for the traditional approach

---

### Example 3: Onboarding a new codebase

**Task:** Understand the overall structure of a new project

```
# Step 1: Get repo outline
get_repo_outline: { "repo": "owner/repo" }

# Step 2: Get file tree for main directory
get_file_tree: { "repo": "owner/repo", "path": "src" }

# Step 3: Search for entry points
search_symbols: { "repo": "owner/repo", "query": "main" }
```

**Result:** ~3,000 tokens vs ~200,000+ for the traditional approach

---

## Supported Languages

| Language | Extensions | Symbol Types |
|-----------|-----------|--------------|
| Python | `.py` | function, class, method, constant, type |
| JavaScript | `.js`, `.jsx` | function, class, method, constant, type |
| TypeScript | `.ts`, `.tsx` | function, class, method, constant, type |
| Go | `.go` | function, method, type, constant |
| Rust | `.rs` | function, impl, type, constant |
| Java | `.java` | method, class, type, constant, record |
| PHP | `.php` | function, class, type, constant |
| C# | `.cs` | class, method, type, record |
| C | `.c` | function, type, constant |
| C++ | `.cpp`, `.cc`, `.cxx`, `.hpp`, `.hxx`, `.h` | function, class, type, constant |
| Elixir | `.ex`, `.exs` | class (module/impl), type, function |
| Ruby | `.rb`, `.rake` | class, type, function |
| Dart | `.dart` | function, class, method, type |
| C/C++ Header | `.h` (parsed as C++ first) | class, method, type, constant |

---

## Best Practices

### 1. Symbol Search First

Always try `search_symbols` before `search_text`:
- Symbol search is ~5× more efficient
- Only fall back to text search when structure isn't enough
- `search_text` retrieves full file content, which is expensive

### 2. Get Outlines Before Content

Use `get_file_outline` to understand structure:
- Shows all symbols in file
- Reveals hierarchy and organization
- Helps you choose the right symbol
- Only then use `get_file_content` for specific lines

### 3. Batch Related Symbols

Use `get_symbols` for multiple related symbols:
- Single call returns multiple implementations
- Reduces MCP round-trips
- More efficient than sequential `get_symbol` calls

### 4. Keep Line Ranges and Context Tight

In `get_file_content`, specify exact line ranges:
- `start_line` and `end_line` should be as tight as possible
- Only request what you actually need
- Don't grab "extra lines just in case"
- Likewise, keep `context_lines` small in `search_text`

### 5. Index Once, Query Forever

Don't re-index on every query:
- Index is stored locally and persists
- Subsequent queries are instant
- Only re-index when codebase changes significantly
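
One way to apply this rule in client code is to consult `list_repos` before calling `index_repo`. The sketch below assumes `list_repos` returns a list of records with a `repo` field, which is a guess at the response shape; `ensure_indexed` and `FakeServer` are hypothetical names used only for the demo.

```python
def ensure_indexed(call_tool, repo: str) -> bool:
    """Index the repo only if list_repos does not already report it.

    Returns True when indexing actually ran.
    """
    indexed = {entry["repo"] for entry in call_tool("list_repos", {})}
    if repo in indexed:
        return False
    call_tool("index_repo", {"url": repo})
    return True


class FakeServer:
    """In-memory stand-in for the MCP server, so the sketch runs on its own."""

    def __init__(self):
        self.repos = []
        self.index_calls = 0

    def __call__(self, name, args):
        if name == "list_repos":
            return [{"repo": r} for r in self.repos]
        if name == "index_repo":
            self.index_calls += 1
            self.repos.append(args["url"])
            return {"ok": True}
        raise ValueError(f"unexpected tool: {name}")


server = FakeServer()
first = ensure_indexed(server, "owner/repo")
second = ensure_indexed(server, "owner/repo")
print(first, second, server.index_calls)
```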

---

## Integration with Gold Standard

JCodeMunch usage maps onto the Gold Standard rules as follows:

| Gold Standard Rule | JCodeMunch Application |
|-------------------|----------------------|
| Read-Back Verification | Verify symbol IDs match returned implementations |
| Executable Proof | Show tool outputs with token savings |
| Loop Prevention | Use `_meta` feedback to refine queries |
| Evidence Over Claims | Display `tokens_saved` from `_meta` envelope |

---

## Anti-Patterns

Patterns to **avoid** when working with JCodeMunch:

1. **Brute-Force File Reading**
   - Don't `Read` entire files to find one function
   - Use `search_symbols` → `get_symbol` instead

2. **Unnecessary Full Content Requests**
   - Don't request 100 lines when you need 5
   - Use tight line ranges in `get_file_content`

3. **Skipping Symbol Search**
   - Always try `search_symbols` first
   - Only use `search_text` when structure isn't enough

4. **Ignoring Indexes**
   - Check if repo is already indexed before re-indexing
   - Use `list_repos` to see what's available

---

## Philosophy Summary

JCodeMunch embodies:
1. **Precision > Brute Force** - Exact symbol beats fuzzy scanning
2. **Index Once, Query Forever** - Upfront cost, cheap access
3. **Structure > Volume** - Outlines and hierarchies over full files
4. **Token Economics** - Every token should earn its place

**"Agents do not need bigger and bigger context windows. They need structured retrieval."**

---

## Installation & Configuration

**Prerequisites:**
- Python 3.10+
- pip (for the install step)
- uv, which provides the `uvx` command used in the MCP client configuration

**Install:**
```bash
pip install jcodemunch-mcp
```

**Configure MCP Client:**
```bash
claude mcp add jcodemunch uvx jcodemunch-mcp
```

**Or configure for project-specific scope:**
```bash
claude mcp add --scope project jcodemunch uvx jcodemunch-mcp
```

**Restart Claude Code after adding the server.**

---

**"Precision context beats brute-force context. Index once. Query cheaply forever."**
