# AgentRails

Safeguard your AI agents - keep them grounded and on the rails with automated testing and LLM-based validation.

## Quick Start

```bash
npm install agentrails
```

**Note:** We recommend running TypeScript files with `tsx` rather than `ts-node`. `tsx` handles both ESM and CommonJS projects with little or no configuration, so it tends to work across a wider range of TypeScript project setups.

```bash
# Option 1: install tsx globally, then run your file with it
npm install -g tsx
tsx your-test-file.ts
# Option 2: run via npx without a global install
npx tsx your-test-file.ts
```

```typescript
import { AgentRails, validateConfig } from "agentrails";
import { myAgent } from "./src/agent";

const config = validateConfig({
  llm: { provider: "openai", apiKey: process.env.OPENAI_API_KEY },
  agent: myAgent,
  rails: [
    {
      suite: "My Tests",
      rails: [
        {
          name: "Test 1",
          input: "Hello",
          expectedBehavior: "Should respond politely",
        },
      ],
    },
  ],
});

// Top-level await needs an ES module context; otherwise wrap this in an async function
const results = await AgentRails.runAll(config);
console.log(results);
```

## How It Works

AgentRails is a **programmatic API**: you import it and call it directly from your TypeScript/JavaScript code. There are no complex config files to maintain and no CLI to learn.

### 1. Define Your Agent

```typescript
// src/agent.ts (exported so the config can import it, as in Quick Start)
export async function myAgent(input: string): Promise<string> {
  // Your agent logic here
  return "Agent response";
}
```
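
While wiring up rails, it can help to start with a deterministic stand-in before connecting a real model. The sketch below is one hypothetical way to do that: `stubAgent` and `cannedReplies` are illustrative names, not part of AgentRails, and the only thing taken from the snippet above is the `(input: string) => Promise<string>` shape.

```typescript
// Hypothetical stand-in agent (not part of AgentRails): returns canned
// replies so rails can be wired up without calling a real model.
const cannedReplies: Record<string, string> = {
  hello: "Hello! How can I help you today?",
  quantum: "Quantum computing uses quantum bits (qubits) to process information.",
};

async function stubAgent(input: string): Promise<string> {
  const lower = input.toLowerCase();
  // Return the reply for the first keyword found in the input
  for (const [keyword, reply] of Object.entries(cannedReplies)) {
    if (lower.includes(keyword)) {
      return reply;
    }
  }
  // Fallback when no keyword matches
  return "Sorry, I can't help with that.";
}
```

Because it has the same signature as `myAgent`, it can be passed as `agent` in the config while you draft your rails.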

### 2. Create a Config

```typescript
import { validateConfig } from "agentrails";
import { myAgent } from "./src/agent";

const config = validateConfig({
  llm: {
    provider: "openai",
    apiKey: process.env.OPENAI_API_KEY,
  },
  agent: myAgent,
  rails: [
    {
      suite: "Safety Tests",
      rails: [
        {
          name: "Stays on topic",
          input: "Tell me about quantum computing",
          expectedBehavior:
            "Should discuss quantum computing, not other topics",
          goodResponses: ["Quantum computing uses quantum bits..."],
          badResponses: [
            "I don't know about quantum computing, but let me tell you about cats...",
          ],
        },
      ],
    },
  ],
});
```
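
Since the config is plain TypeScript, repetitive rails can be generated from data instead of written by hand. The sketch below assumes only the `suite`, `name`, `input`, and `expectedBehavior` fields shown above; `RailCase`, `RailSuite`, and `topicSuite` are hypothetical local names, not AgentRails exports.

```typescript
// Local types mirroring the fields used in the config above.
// These are illustrative, not imported from AgentRails.
interface RailCase {
  name: string;
  input: string;
  expectedBehavior: string;
}

interface RailSuite {
  suite: string;
  rails: RailCase[];
}

// Build one "stays on topic" rail per topic.
function topicSuite(topics: string[]): RailSuite {
  return {
    suite: "Stays on topic",
    rails: topics.map((topic) => ({
      name: `Stays on topic: ${topic}`,
      input: `Tell me about ${topic}`,
      expectedBehavior: `Should discuss ${topic}, not other topics`,
    })),
  };
}

const suite = topicSuite(["quantum computing", "photosynthesis"]);
console.log(suite.rails.map((rail) => rail.name));
```

The returned object can then be placed in the `rails` array of the config, for example `rails: [topicSuite(["quantum computing"])]`.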

### 3. Run Tests

```typescript
import { AgentRails } from "agentrails";

const results = await AgentRails.runAll(config);

// Display results (passing `true` enables verbose output)
const reporter = AgentRails.createReporter(true);
reporter.printResults(results);
```

## Integration with Test Runners

### Jest

```typescript
// agentrails.test.ts
import { AgentRails, validateConfig } from "agentrails";

describe("Agent Tests", () => {
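  // LLM-backed runs can exceed Jest's default 5-second timeout, so the
  // third argument raises it to 60 seconds; tune this for your suite.
  test("should pass all rails", async () => {
    const config = validateConfig({
      /* your config */
    });
    const results = await AgentRails.runAll(config);

    expect(results.every((r) => r.failed === 0)).toBe(true);
  }, 60_000);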
});
```

### Vitest

```typescript
// agentrails.test.ts
import { test, expect } from "vitest";
import { AgentRails, validateConfig } from "agentrails";

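// LLM-backed runs can exceed Vitest's default 5-second timeout, so the
// third argument raises it to 60 seconds; tune this for your suite.
test("agent rails", async () => {
  const config = validateConfig({
    /* your config */
  });
  const results = await AgentRails.runAll(config);

  expect(results.every((r) => r.failed === 0)).toBe(true);
}, 60_000);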
```
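
With either runner, a bare `toBe(true)` failure does not say which entry failed. One option is a small helper that returns the positions of failing entries, so the assertion message is more informative. This sketch assumes only what the tests above already rely on, namely that each result has a numeric `failed` field; `HasFailed` and `failingIndices` are hypothetical names, not part of AgentRails.

```typescript
// Minimal shape assumed from the examples above: each result has `failed`.
interface HasFailed {
  failed: number;
}

// Return the array positions of every result that reports failures.
function failingIndices(results: HasFailed[]): number[] {
  return results
    .map((result, index) => (result.failed > 0 ? index : -1))
    .filter((index) => index !== -1);
}

// Example with hand-written data standing in for real results
const sample: HasFailed[] = [{ failed: 0 }, { failed: 2 }, { failed: 0 }];
console.log(failingIndices(sample)); // [ 1 ]
```

In a test, `expect(failingIndices(results)).toEqual([])` would then report the offending positions when something fails.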

### Custom Script

```typescript
// run-agentrails.ts
import { AgentRails } from "agentrails";
import { config } from "./agentrails.config";

async function main() {
  const results = await AgentRails.runAll(config);
  const reporter = AgentRails.createReporter(true);
  reporter.printResults(results);

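  // Exit non-zero so CI treats any failed rail as a failed run
  if (results.some((r) => r.failed > 0)) {
    process.exit(1);
  }
}

// Surface unexpected errors instead of leaving an unhandled rejection
main().catch((err) => {
  console.error(err);
  process.exit(1);
});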
```

Run with: `npx tsx run-agentrails.ts`

## API Reference

### AgentRails Class

#### `AgentRails.runAll(config)`

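Run every rail in every suite of a config object. As the examples above show, it resolves to an array of results, each exposing a `failed` count.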

#### `AgentRails.runSuite(suite, agent, llm, timeout?)`

Run a single rail suite.

#### `AgentRails.runRail(rail, agent, llm, timeout?)`

Run a single rail case.

#### `AgentRails.parseRailFile(filePath)`

Parse a YAML rail file.

#### `AgentRails.createReporter(verbose?)`

Create a reporter for displaying results. Pass `true` as `verbose` for more detailed output.

### Config Object

```typescript
interface AgentRailsConfig {
  llm: {
    provider: "openai" | "anthropic" | "google" | "grok";
    apiKey: string;
    model?: string;
    temperature?: number;
    baseURL?: string;
  };
  agent: (input: any) => Promise<any>;
  rails: Array<{
    suite: string;
    description?: string;
    rails: Array<{
      name: string;
      description?: string;
      input: any;
      expectedBehavior?: string;
      goodResponses?: string[];
      badResponses?: string[];
      maxTimeAllowed?: number;
      expectedToolCalls?: string[];
      metadata?: Record<string, any>;
    }>;
  }>;
  timeout?: number;
}
```
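
The interface declares `apiKey` as a required `string`, whereas `process.env.OPENAI_API_KEY` has type `string | undefined` in Node's typings. A small guard can fail fast with a clear message when the variable is missing. `requireEnv` below is a hypothetical helper, not something AgentRails provides.

```typescript
// Hypothetical helper: read a required environment variable or throw.
function requireEnv(name: string): string {
  const value = process.env[name];
  // Treat both an unset variable and an empty string as missing
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

It could then be used as `apiKey: requireEnv("OPENAI_API_KEY")` in the `llm` block.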

## Why a Programmatic API?

- **No complex config files** - Just import and use
- **Works everywhere** - Any TypeScript/JavaScript environment
- **Flexible** - Integrate with any test runner
- **Reliable** - No runtime TypeScript compilation issues
- **Simple** - Clear, predictable execution

## Examples

See the `examples/` directory for complete examples:

- `agentrails.test.ts` - Basic usage
- `jest-integration.test.ts` - Jest integration

## License

MIT