# Security & Architecture

## Overview

This document describes the two major security and reliability improvements implemented in AI Live Log Bridge:

1. **Session Isolation** - Prevents interleaved outputs from concurrent commands
2. **Secret Redaction** - Automatically redacts sensitive information before logging

---

## 1. Session Isolation

### The Problem

When multiple commands run concurrently (e.g., `ai npm run dev` in one terminal and `ai npm test` in another), outputs would interleave in a single log file, making it impossible for AI to parse stack traces correctly.

**Before:**
```
[timestamp] Command: npm run dev
Server starting...
[timestamp] Command: npm test
Running tests...
Server started on port 3000    ← From first command
Test 1 passed                  ← From second command
FAIL: Test 2 failed            ← From second command
Listening on port 3000         ← From first command
```

### The Solution

Each command execution now gets:
- A **unique session ID** (format: `YYYYMMDDHHmmss-XXXX`, e.g., `20251122173109-a3f2`)
- A **dedicated log file** (`~/.mcp-logs/session-{ID}.log`)
- An entry in the **master index** (`~/.mcp-logs/master-index.log`)

**After:**
```
~/.mcp-logs/session-20251122173109-6ab4.log:
  [2025-11-22] Session: 20251122173109-6ab4
  [2025-11-22] Command: npm run dev
  Server starting...
  Server started on port 3000
  Listening on port 3000

~/.mcp-logs/session-20251122173109-7862.log:
  [2025-11-22] Session: 20251122173109-7862
  [2025-11-22] Command: npm test
  Running tests...
  Test 1 passed
  FAIL: Test 2 failed
```

### Implementation Details

**Files:**
- `src/session.ts` - Session ID generation and management
- `src/storage.ts` - Multi-file log reading
- `src/wrapper.ts` - Session-aware command wrapper

**Key Functions:**
- `generateSessionId()` - Creates unique session IDs
- `registerSession()` - Records session in master index
- `getSessionLogPath()` - Returns path for session log file
- `readRecentLogs()` - Reads from multiple session files intelligently

**Master Index Format:**
```
[2025-11-22T17:31:09.584Z] [20251122173109-6ab4] npm run dev
[2025-11-22T17:31:09.683Z] [20251122173109-7862] npm test
[2025-11-22T17:31:09.455Z] [20251122173109-d5eb] bash -c echo "test"
```

### Benefits

✅ **No Interleaved Outputs**: Each command's output is completely isolated
✅ **Clean Stack Traces**: AI can parse error messages correctly
✅ **Concurrent Execution**: Run multiple commands without interference
✅ **Session History**: Master index tracks all commands chronologically
✅ **Intelligent Reading**: `ai --last` intelligently aggregates from multiple sessions

---

## 2. Secret Redaction

### The Problem

Commands often output sensitive information that gets logged:
- API keys in environment variables
- Database connection strings with passwords
- Authentication tokens
- Private keys

This creates a **critical security vulnerability**: secrets get logged and potentially sent to LLMs.

**Before:**
```bash
$ echo "API_KEY=sk-1234567890abcdef"
API_KEY=sk-1234567890abcdef

# Log file contains:
API_KEY=sk-1234567890abcdef  ← SECRET EXPOSED
```

### The Solution

All output is automatically redacted **before** writing to log files using pattern matching for common secret formats.

**After:**
```bash
$ ai bash -c 'echo "API_KEY=sk-1234567890abcdef"'
API_KEY=sk-1234567890abcdef  ← User sees real output in terminal

# Log file contains:
API_KEY=[REDACTED]  ← Secret protected
```

### Redacted Patterns

The system detects and redacts **15+ secret patterns**:

1. **Generic API Keys & Tokens**
   - Pattern: `api_key=xxx`, `token=xxx`, `secret=xxx`, `password=xxx`
   - Example: `API_KEY=abc123` → `API_KEY=[REDACTED]`

2. **AWS Credentials**
   - Pattern: `AKIA[0-9A-Z]{16}`, `aws_secret_access_key=xxx`
   - Example: `AKIAIOSFODNN7EXAMPLE` → `AKIA[REDACTED]`

3. **GitHub Tokens**
   - Pattern: `ghp_xxx`, `ghs_xxx`
   - Example: `ghp_1234567890abcdef` → `ghp_[REDACTED]`

4. **Stripe Keys**
   - Pattern: `sk_live_xxx`, `pk_live_xxx`
   - Example: `sk_live_abc123` → `sk_live_[REDACTED]`

5. **OpenAI API Keys**
   - Pattern: `sk-[a-zA-Z0-9]{48}`
   - Example: `sk-1234567890abcdef` → `sk-[REDACTED]`

6. **Anthropic API Keys**
   - Pattern: `sk-ant-xxx`
   - Example: `sk-ant-api03-xxx` → `sk-ant-[REDACTED]`

7. **Bearer Tokens**
   - Pattern: `Bearer xxx`
   - Example: `Bearer abc123` → `Bearer [REDACTED]`

8. **JWT Tokens**
   - Pattern: `eyJxxx.eyJxxx.xxx`
   - Example: `eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.xxx` → `eyJ[REDACTED_JWT]`

9. **Database Connection Strings**
   - Pattern: `mongodb://user:pass@host`, `postgres://user:pass@host`
   - Example: `mongodb://admin:secret123@localhost` → `mongodb://[USER]:[REDACTED]@localhost`

10. **Generic Passwords in URLs**
    - Pattern: `://user:password@host`
    - Example: `https://admin:pass123@example.com` → `https://[USER]:[REDACTED]@example.com`

11. **PEM Private Keys**
    - Pattern: `-----BEGIN PRIVATE KEY-----...-----END PRIVATE KEY-----`
    - Result: Key body replaced with `[REDACTED]`

12. **OpenSSH Private Keys**
    - Pattern: `-----BEGIN OPENSSH PRIVATE KEY-----...-----END OPENSSH PRIVATE KEY-----`
    - Result: Key body replaced with `[REDACTED]`

13. **Environment Variable Exports**
    - Pattern: `export SECRET_KEY=xxx`
    - Example: `export API_TOKEN=abc123` → `export API_TOKEN=[REDACTED]`

14. **Credit Card Numbers**
    - Pattern: `1234 5678 9012 3456`
    - Example: `4532 1234 5678 9010` → `XXXX-XXXX-XXXX-[REDACTED]`

15. **Generic Long Secrets (16+ chars)**
    - Pattern: `password=xxx` (where xxx is 16+ characters)
    - Example: `password=verylongsecret123` → `password=[REDACTED]`

### Implementation Details

**File:** [src/redact-secrets.ts](src/redact-secrets.ts)

**Key Functions:**
- `redactSecrets(text: string): string` - Applies all redaction patterns
- `containsSecrets(text: string): boolean` - Checks if text has potential secrets

**Integration Points:**
- [src/wrapper.ts:64](src/wrapper.ts#L64) - Stdout redaction
- [src/wrapper.ts:79](src/wrapper.ts#L79) - Stderr redaction

**Important Notes:**
- ✅ User's terminal shows **real output** (no redaction)
- ✅ Only **log files** contain redacted text
- ✅ Redaction happens **before** writing to disk
- ✅ Secrets **never** reach AI models or external services

### Security Guarantees

1. **Defense in Depth**: Multiple pattern types catch various secret formats
2. **Terminal Privacy**: Users see real output; only logs are protected
3. **Pre-Storage Protection**: Redaction occurs before writing to disk
4. **AI Safety**: Secrets cannot be sent to LLMs via log reading
5. **Extensibility**: Easy to add new patterns in [src/redact-secrets.ts](src/redact-secrets.ts)

### Testing Secret Redaction

```bash
# Test with a fake API key
ai bash -c 'echo "API_KEY=sk-1234567890abcdef1234567890abcdef"'

# Terminal shows:
# API_KEY=sk-1234567890abcdef1234567890abcdef

# Check the log:
ai --last 5

# Log shows:
# API_KEY=[REDACTED]
```

---

## 3. How They Work Together

### Flow Diagram

```
┌─────────────────────────────────────────┐
│ User runs: ai npm run dev               │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│ 1. Generate Session ID                  │
│    ID: 20251122173532-a83c               │
│    File: ~/.mcp-logs/session-*.log      │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│ 2. Register in Master Index              │
│    [timestamp] [session-id] command      │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│ 3. Execute Command                       │
│    stdout/stderr → terminal (colors)     │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│ 4. Clean Output                          │
│    Strip ANSI color codes                │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│ 5. Redact Secrets                        │
│    Apply all secret patterns             │
│    API_KEY=sk-xxx → API_KEY=[REDACTED]   │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│ 6. Write to Session Log                  │
│    ~/.mcp-logs/session-{ID}.log          │
└─────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────┐
│ 7. AI Access                             │
│    • ai --last → reads multiple sessions │
│    • MCP tools → filtered/full access    │
└─────────────────────────────────────────┘
```

---

## 4. Adding New Secret Patterns

To add a new secret pattern:

1. **Edit** [src/redact-secrets.ts](src/redact-secrets.ts)
2. **Add** to the `SECRET_PATTERNS` array:

```typescript
{
  pattern: /your-regex-pattern/gi,
  replacement: () => 'REPLACEMENT_TEXT'
}
```

3. **Rebuild**:
```bash
npm run build
```

4. **Test**:
```bash
ai bash -c 'echo "YOUR_SECRET=test123"'
ai --last 5  # Verify redaction
```

### Example: Adding Slack Token Pattern

```typescript
// In src/redact-secrets.ts, add to SECRET_PATTERNS:
{
  pattern: /xoxb-[0-9]{11,}-[0-9]{11,}-[a-zA-Z0-9]{24}/g,
  replacement: () => 'xoxb-[REDACTED]'
}
```

---

## 5. File Structure

```
~/.mcp-logs/
├── master-index.log                    # All sessions chronologically
├── session-20251122173532-2e39.log    # Session 1
├── session-20251122173532-a83c.log    # Session 2
├── session-20251122173533-a889.log    # Session 3
└── ...
```

**Master Index:**
```
[2025-11-22T17:35:32.814Z] [20251122173532-2e39] bash -c echo "test"
[2025-11-22T17:35:32.924Z] [20251122173532-a83c] bash -c for i in {1..5}; do echo "$i"; done
[2025-11-22T17:35:33.027Z] [20251122173533-a889] npm test
```

**Individual Session Log:**
```
================================================================================
[2025-11-22T17:35:32.924Z] Session: 20251122173532-a83c
[2025-11-22T17:35:32.924Z] Command: bash -c for i in {1..5}; do echo "$i"; done
================================================================================
1
2
3
4
5

[2025-11-22T17:35:33.968Z] Process exited with code: 0
```

---

## 6. Performance Considerations

### Session Isolation

- **Overhead**: ~1ms per command (session ID generation)
- **Disk Usage**: One file per command execution
- **Read Performance**: Intelligent aggregation reads only recent files

### Secret Redaction

- **Overhead**: <1ms per line of output
- **CPU Usage**: Minimal (regex pattern matching)
- **Memory**: Negligible (streaming line-by-line)

### Optimization Features

1. **Streaming Processing**: Output is redacted line-by-line, not buffered
2. **Limited File Reads**: `readRecentLogs()` only reads the N most recent sessions
3. **Efficient Patterns**: Regex patterns are pre-compiled
4. **Session Pruning**: Users can manually clean old sessions with `rm ~/.mcp-logs/session-*.log`

---

## 7. Limitations & Future Improvements

### Current Limitations

1. **Pattern-Based Detection**: Only catches known secret formats
   - Custom secret formats may not be detected
   - Base64-encoded secrets not automatically detected

2. **No Retroactive Redaction**: Existing logs are not retroactively redacted
   - Only new commands get secret redaction

3. **Manual Session Cleanup**: No automatic log rotation
   - Old session files accumulate indefinitely

### Future Improvements

1. **Auto Log Rotation**
   - Automatically delete sessions older than N days
   - Configurable retention policy

2. **Enhanced Secret Detection**
   - Machine learning-based secret detection
   - Base64/hex-encoded secret detection
   - Entropy-based detection for random tokens

3. **Session Compression**
   - Compress old session files to save disk space
   - Keep only recent sessions uncompressed

4. **Audit Logging**
   - Track which secrets were redacted
   - Alert on high-entropy outputs

---

## 8. Testing

### Manual Testing

```bash
# Test secret redaction
ai bash -c 'echo "API_KEY=sk-1234567890abcdef1234567890abcdef"'
ai --last 5  # Should show [REDACTED]

# Test session isolation
ai bash -c 'for i in {1..5}; do echo "CMD1: $i"; sleep 0.2; done' &
ai bash -c 'for l in A B C D E; do echo "CMD2: $l"; sleep 0.2; done' &
wait
ls ~/.mcp-logs/session-*.log  # Should show 2 separate files
```

### Automated Testing

Run the included test script:

```bash
npm run build
./test-fixes.sh
```

**Expected output:**
- ✅ Secret redaction working!
- ✅ Session isolation working!
- ✅ Each command has its own log file
- ✅ Outputs are NOT interleaved

---

## 9. Security Best Practices

### For Users

1. **Never commit `.mcp-logs/`**: Add to `.gitignore`
2. **Review logs before sharing**: Even with redaction, review logs before sharing
3. **Rotate logs regularly**: Clean old sessions: `rm ~/.mcp-logs/session-*.log`
4. **Report missing patterns**: If a secret isn't redacted, report it
5. **Use with caution in production**: This tool is for local development

### For Contributors

1. **Test new patterns**: Add test cases for new secret patterns
2. **Avoid false positives**: Ensure patterns don't redact non-secrets
3. **Document patterns**: Clearly document what each pattern matches
4. **Performance testing**: Ensure new patterns don't slow down execution
5. **Security review**: Review changes to redaction logic carefully

---

## 10. Troubleshooting

### Secrets Not Being Redacted

**Problem**: Secrets appear in logs unredacted

**Solutions**:
1. Check if the secret matches existing patterns in [src/redact-secrets.ts](src/redact-secrets.ts)
2. Add a new pattern if needed
3. Rebuild: `npm run build`
4. Test: `ai bash -c 'echo "SECRET=test"'`

### Interleaved Outputs Still Occurring

**Problem**: Concurrent commands have mixed outputs

**Solutions**:
1. Verify both commands use `ai` wrapper
2. Check session files: `ls ~/.mcp-logs/session-*.log`
3. Verify each command created its own session ID
4. Check master index: `cat ~/.mcp-logs/master-index.log`

### Too Many Session Files

**Problem**: Disk space filling up with old sessions

**Solution**: Clean old sessions periodically
```bash
# Delete sessions older than 7 days
find ~/.mcp-logs -name "session-*.log" -mtime +7 -delete

# Keep only last 50 sessions
ls -t ~/.mcp-logs/session-*.log | tail -n +51 | xargs rm
```

---

## 11. Contributing

If you discover a secret pattern that isn't being redacted:

1. **Report it**: Open an issue with the pattern (not the actual secret!)
2. **Submit a PR**: Add the pattern to [src/redact-secrets.ts](src/redact-secrets.ts)
3. **Add tests**: Include test cases in your PR
4. **Update docs**: Document the new pattern in this file

**Template for new patterns:**
```typescript
{
  pattern: /your-pattern-here/gi,
  replacement: () => 'REPLACEMENT_TEXT',
  // Add comment explaining what this pattern matches
  // Example: // Matches Slack bot tokens (xoxb-...)
}
```

---

## License

MIT License - See [LICENSE](LICENSE) for details.