# hazo_llm_api

A wrapper package for calling different LLMs with built-in prompt management and variable substitution.

## Overview

`hazo_llm_api` provides specialized functions for different LLM input/output combinations:

| Function | Input | Output | Use Case |
|----------|-------|--------|----------|
| `hazo_llm_text_text` | Text | Text | Standard text generation, Q&A, summarization |
| `hazo_llm_image_text` | Image | Text | Image analysis, OCR, object detection |
| `hazo_llm_text_image` | Text | Image | Image generation from descriptions |
| `hazo_llm_image_image` | Image(s) | Image | Image editing, combining, transformation |
| `hazo_llm_document_text` | PDF/Document | Text | PDF analysis, document extraction |
| `hazo_llm_text_image_text` | Text × 2 | Image + Text | Generate image then analyze it (chained) |
| `hazo_llm_image_image_text` | Images + Prompts | Image + Text | Chain image transformations then describe (chained) |
| `hazo_llm_prompt_chain` | Chain Definition | Merged Results | Execute multiple prompts with dynamic value resolution |
| `hazo_llm_dynamic_data_extract` | Initial Prompt | Merged Results | Dynamic chain where next prompt is determined by JSON output |

**Features:**
- **Multi-Provider Support**: Use Gemini, Qwen, Anthropic, OpenAI, DeepSeek, or add your own LLM providers
- **Cascade Fallback**: Automatically retry across providers on failure with configurable cascade
- **Observability**: Per-call usage metrics (tokens, cost, latency) with automatic log table and query API
- **Cost Cap**: Opt-in per-session spend limit with block or warn modes
- **Embeddings**: Single and batch text embeddings via OpenAI with LRU/persistent cache
- **Admin UI**: `LLMCostDashboard` and `LLMCallInspector` React components for spend monitoring
- **Prompt Management**: Store and retrieve prompts from a SQLite database with LRU caching
- **Variable Substitution**: Replace `$variables` in prompts with dynamic values
- **Multi-modal Support**: Handle text, images, and documents (PDFs) seamlessly
- **Extensible Architecture**: Provider-based design with simple registration system
- **Type-Safe**: Full TypeScript support with comprehensive type definitions
- **Auto-Initialization**: Database and log table initialize automatically on first call
- **Flexible Logging**: Pass any compatible logger (e.g., `hazo_logs`, Winston, or custom)
- **Lifecycle Hooks**: Monitor, log, and analyze LLM calls with beforeRequest, afterResponse, and onError hooks
- **Streaming Support**: Real-time streaming responses for text generation
- **Prompt Chaining**: Execute multiple prompts with dynamic value resolution from previous responses
- **Dynamic Data Extraction**: Build conditional prompt chains based on JSON output
- **Reusable PromptEditor Component**: Pre-built React component for managing prompts with full CRUD operations
- **hazo_connect Abstraction**: Unified interface for prompt data operations (REST API or direct DB)

## Installation

```bash
npm install hazo_llm_api hazo_core
```

`hazo_core` is a required peer dependency. Optional peers add extra capabilities: `hazo_logs` (structured logging), `hazo_debug` (LLM debug callback), `hazo_jobs` (log purge job), and `hazo_ui` (PromptEditor UI components).

## Quick Start

### 1. Set Up Configuration

Create `config/hazo_llm_api_config.ini` in your project:

```ini
[llm]
enabled_llms=["gemini", "qwen"]
primary_llm=gemini
sqlite_path=prompt_library.sqlite

[llm_gemini]
api_url=https://generativelanguage.googleapis.com/v1/models/gemini-2.5-flash:generateContent
api_url_image=https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent
capabilities=["text_text", "image_text", "text_image", "image_image"]
image_temperature=0.1
```

Create `.env.local` with your API keys:

```bash
GEMINI_API_KEY=your_api_key_here
QWEN_API_KEY=your_qwen_api_key_here
```
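
A missing key tends to surface only when the first provider call fails, so it can help to check the environment at startup. The sketch below is not part of `hazo_llm_api`: `find_missing_api_keys` is a hypothetical helper, and the provider-to-variable mapping is copied from the Supported Providers table later in this README.

```typescript
// Hypothetical helper, not part of hazo_llm_api.
// Maps provider names to the env vars listed under "Supported Providers".
const PROVIDER_ENV_VARS: Record<string, string> = {
  gemini: 'GEMINI_API_KEY',
  qwen: 'QWEN_API_KEY',
  anthropic: 'ANTHROPIC_API_KEY',
  openai: 'OPENAI_API_KEY',
  deepseek: 'DEEPSEEK_API_KEY',
};

// Returns the env var names that are unset or blank for the given providers.
function find_missing_api_keys(
  enabled_llms: string[],
  env: Record<string, string | undefined>,
): string[] {
  const missing: string[] = [];
  for (const name of enabled_llms) {
    const env_var = PROVIDER_ENV_VARS[name];
    // Unknown (custom) providers are skipped: their key name is not known here.
    if (env_var === undefined) continue;
    const value = env[env_var];
    if (value === undefined || value.trim() === '') missing.push(env_var);
  }
  return missing;
}
```

Calling `find_missing_api_keys(['gemini', 'qwen'], process.env)` before `initialize_llm_api` and throwing on a non-empty result gives an early, readable failure.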

### 2. Initialize the LLM API

```typescript
import {
  initialize_llm_api,
  hazo_llm_text_text,
  hazo_llm_image_text,
  hazo_llm_text_image,
  hazo_llm_image_image,
} from 'hazo_llm_api/server';

// Option A: Use hazo_logs package (npm install hazo_logs)
import { createLogger } from 'hazo_logs';
const hazoLogger = createLogger('my_app');
const logger = {
  error: (msg, meta) => hazoLogger.error(msg, meta),
  info: (msg, meta) => hazoLogger.info(msg, meta),
  warn: (msg, meta) => hazoLogger.warn(msg, meta),
  debug: (msg, meta) => hazoLogger.debug(msg, meta),
};

// Option B: Use a simple console logger
// const logger = {
//   error: console.error,
//   info: console.log,
//   warn: console.warn,
//   debug: console.debug,
// };

// Initialize - reads config from config/hazo_llm_api_config.ini
await initialize_llm_api({ logger });
```

### 3. Text → Text (Standard Generation)

```typescript
const response = await hazo_llm_text_text({
  prompt: 'Explain quantum computing in simple terms.',
});

if (response.success) {
  console.log(response.text);
}
```

### 4. Image → Text (Image Analysis)

```typescript
const response = await hazo_llm_image_text({
  prompt: 'Describe what you see in this image.',
  image_b64: 'base64_encoded_image_string...',
  image_mime_type: 'image/jpeg',
});

if (response.success) {
  console.log(response.text);
}
```
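
Images often arrive as data URLs (for example from a browser `FileReader` or canvas), while the call above expects the base64 payload and MIME type separately. The following is a minimal sketch of splitting one into the two fields; `parse_image_data_url` is a hypothetical helper, not a package export.

```typescript
// Hypothetical helper, not part of hazo_llm_api.
interface ParsedImage {
  image_b64: string;
  image_mime_type: string;
}

// Splits "data:<mime>;base64,<payload>" into its parts; returns null otherwise.
function parse_image_data_url(data_url: string): ParsedImage | null {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(data_url.trim());
  if (match === null) return null;
  return { image_mime_type: match[1], image_b64: match[2] };
}
```

The returned object can be spread into the `hazo_llm_image_text` parameters alongside `prompt`.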

### 5. Text → Image (Image Generation)

```typescript
const response = await hazo_llm_text_image({
  prompt: 'A serene mountain landscape at sunset',
});

if (response.success && response.image_b64) {
  // Use the generated image
  const image_src = `data:${response.image_mime_type};base64,${response.image_b64}`;
}
```

### 6. Image → Image (Single Image Transformation)

```typescript
const response = await hazo_llm_image_image({
  prompt: 'Convert this image to a watercolor painting style',
  image_b64: 'base64_encoded_image_string...',
  image_mime_type: 'image/jpeg',
});

if (response.success && response.image_b64) {
  // Use the transformed image
}
```

### 7. Multiple Images → Image (Combine Images)

```typescript
const response = await hazo_llm_image_image({
  prompt: 'Combine these two images into one cohesive creative image',
  images: [
    { data: 'base64_image_1...', mime_type: 'image/jpeg' },
    { data: 'base64_image_2...', mime_type: 'image/png' },
  ],
});

if (response.success && response.image_b64) {
  // Use the combined image
}
```

### 8. Text → Image → Text (Chained)

```typescript
const response = await hazo_llm_text_image_text({
  prompt_image: 'A serene Japanese garden with a koi pond',
  prompt_text: 'Describe the mood and elements of this image in detail.',
});

if (response.success) {
  // response.image_b64 - the generated image
  // response.text - the analysis of the generated image
}
```

### 9. Images → Image → Text (Multi-Step Chain)

```typescript
const response = await hazo_llm_image_image_text({
  // Minimum 2 images required
  images: [
    { image_b64: 'base64_image_1...', image_mime_type: 'image/jpeg' },
    { image_b64: 'base64_image_2...', image_mime_type: 'image/png' },
    { image_b64: 'base64_image_3...', image_mime_type: 'image/jpeg' },
  ],
  // Number of prompts = number of images - 1
  prompts: [
    'Combine these two images into a surreal landscape',  // Combines image 1 + 2
    'Add elements from this third image to the result',   // Combines result + image 3
  ],
  description_prompt: 'Describe this final artistic composition in detail.',
});

if (response.success) {
  // response.image_b64 - the final chained image
  // response.text - the description of the result
}
```

**Flow:**
1. Step 1: `images[0]` + `images[1]` + `prompts[0]` → result_1
2. Step 2: result_1 + `images[2]` + `prompts[1]` → result_2
3. ... continue for more images
4. Final: last result + `description_prompt` → text output
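
The flow above can be sketched as a small planning function that also enforces the two input rules (at least two images, one fewer prompt than images). `plan_image_chain` and `ChainStep` are illustrative names only; the package performs this sequencing internally and may do so differently.

```typescript
// Hypothetical sketch; the names below are illustrative, not package APIs.
interface ChainStep {
  left: string;   // 'images[0]' for the first step, 'result_<n>' afterwards
  right: string;  // the next input image
  prompt: string; // the prompt that combines left and right
}

// Expands an image count and prompt list into the ordered combination steps.
function plan_image_chain(image_count: number, prompts: string[]): ChainStep[] {
  if (image_count < 2) {
    throw new Error('At least 2 images are required');
  }
  if (prompts.length !== image_count - 1) {
    throw new Error(`Expected ${image_count - 1} prompts, got ${prompts.length}`);
  }
  return prompts.map((prompt, i) => ({
    left: i === 0 ? 'images[0]' : `result_${i}`,
    right: `images[${i + 1}]`,
    prompt,
  }));
}
```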

### 10. Document → Text (PDF Analysis)

```typescript
import { hazo_llm_document_text } from 'hazo_llm_api/server';
import * as fs from 'fs';

// Read PDF file as base64
const pdf_buffer = fs.readFileSync('./document.pdf');
const pdf_b64 = pdf_buffer.toString('base64');

const response = await hazo_llm_document_text({
  prompt: 'Extract all key information from this invoice including date, amount, and line items.',
  document_b64: pdf_b64,
  document_mime_type: 'application/pdf',
  max_pages: 10, // Optional: limit pages for large documents
});

if (response.success) {
  console.log(response.text);
}

// Multi-document: analyze multiple files collectively in a single LLM call
const receipt1 = fs.readFileSync('./receipt1.pdf');
const receipt2 = fs.readFileSync('./receipt2.pdf');

const multi_response = await hazo_llm_document_text({
  prompt: 'Validate that these receipts total $430 and match the expense claim.',
  document_b64: receipt1.toString('base64'),
  document_mime_type: 'application/pdf',
  additional_documents: [{
    mime_type: 'application/pdf',
    data: receipt2.toString('base64'),
  }],
});
```

### 11. Streaming Responses

```typescript
import { hazo_llm_text_text_stream } from 'hazo_llm_api/server';

const stream = await hazo_llm_text_text_stream({
  prompt: 'Write a detailed explanation of machine learning.',
});

for await (const chunk of stream) {
  if (chunk.error) {
    console.error(chunk.error);
    break;
  }
  process.stdout.write(chunk.text);
  if (chunk.done) break;
}
```
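
When the caller wants the complete text rather than incremental output, the same loop can be wrapped in a helper. This is a sketch under the assumption that chunks carry the `text`, `done`, and `error` fields used above; `StreamChunk` and `collect_stream_text` are hypothetical names, and the real chunk type may include more fields.

```typescript
// Chunk shape assumed from the loop above; the real type may carry more fields.
interface StreamChunk {
  text: string;
  done?: boolean;
  error?: unknown;
}

// Hypothetical helper: drains a chunk stream into a single string.
async function collect_stream_text(stream: AsyncIterable<StreamChunk>): Promise<string> {
  let full_text = '';
  for await (const chunk of stream) {
    if (chunk.error) {
      throw new Error(`Stream failed: ${String(chunk.error)}`);
    }
    full_text += chunk.text;
    if (chunk.done) break; // Stop at the final chunk, as in the loop above.
  }
  return full_text;
}
```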

### 12. Lifecycle Hooks

Monitor, log, and analyze all LLM calls. Hooks fire for every service function (`text_text`, `image_text`, `text_image`, `image_image`, `document_text`) and streaming variants:

```typescript
await initialize_llm_api({
  logger,
  hooks: {
    beforeRequest: (ctx) => {
      console.log(`Starting ${ctx.service_type} call to ${ctx.provider}`);
      console.log('Params:', ctx.params);
    },
    afterResponse: (ctx) => {
      console.log(`[${ctx.provider}] ${ctx.service_type} completed in ${ctx.duration_ms}ms`);
      console.log('Prompt:', ctx.params.prompt);
      console.log('Response:', ctx.response.text);
    },
    onError: (ctx) => {
      console.error(`[${ctx.provider}] ${ctx.error.code}: ${ctx.error.message}`);
    },
  },
});
```
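
Hooks are also a convenient place to aggregate metrics rather than just log them. The sketch below builds a pair of hooks that track per-provider latency. It assumes only the `provider`, `service_type`, and `duration_ms` context fields shown in this README; `create_timing_hooks` is a hypothetical helper.

```typescript
// Context fields assumed from the hook examples in this README.
interface TimingContext {
  provider: string;
  service_type: string;
  duration_ms: number;
}

interface ProviderStats {
  calls: number;
  errors: number;
  total_ms: number;
}

// Hypothetical helper: builds hooks that aggregate per-provider latency.
function create_timing_hooks() {
  const stats = new Map<string, ProviderStats>();

  const record = (ctx: TimingContext, failed: boolean): void => {
    const entry = stats.get(ctx.provider) ?? { calls: 0, errors: 0, total_ms: 0 };
    entry.calls += 1;
    entry.total_ms += ctx.duration_ms;
    if (failed) entry.errors += 1;
    stats.set(ctx.provider, entry);
  };

  return {
    hooks: {
      afterResponse: (ctx: TimingContext) => record(ctx, false),
      onError: (ctx: TimingContext) => record(ctx, true),
    },
    // Mean latency in ms for a provider, or null if it has no recorded calls.
    average_ms: (provider: string): number | null => {
      const entry = stats.get(provider);
      return entry ? entry.total_ms / entry.calls : null;
    },
    stats,
  };
}
```

The returned `hooks` object has the same shape as the `hooks` option above, so it should be usable as `initialize_llm_api({ logger, hooks: timing.hooks })`, assuming the package's hook types accept these narrower context types.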

#### Integration with hazo_debug

If you use `hazo_debug`, wire its LLM debug callback into the hooks:

```typescript
import { create_llm_debug_callback } from 'hazo_debug/server';

const debug_callback = create_llm_debug_callback();

await initialize_llm_api({
  logger,
  hooks: {
    afterResponse: (ctx) => {
      debug_callback({
        service_type: ctx.service_type,
        provider: ctx.provider,
        prompt_text: String(ctx.params.prompt || ''),
        prompt_area: ctx.params.prompt_area as string | undefined,
        prompt_key: ctx.params.prompt_key as string | undefined,
        response_text: ctx.response.text,
        duration_ms: ctx.duration_ms,
        success: ctx.response.success,
      });
    },
    onError: (ctx) => {
      debug_callback({
        service_type: ctx.service_type,
        provider: ctx.provider,
        prompt_text: String(ctx.params.prompt || ''),
        duration_ms: ctx.duration_ms,
        success: false,
        error: ctx.error.message,
        error_code: ctx.error.code,
      });
    },
  },
});
```

### 13. Prompt Chaining

Execute multiple prompts where each can reference results from previous calls:

```typescript
import { hazo_llm_prompt_chain } from 'hazo_llm_api/server';

const response = await hazo_llm_prompt_chain({
  chain_calls: [
    {
      // First call: classify the document
      prompt_area: { match_type: 'direct', value: 'classifier' },
      prompt_key: { match_type: 'direct', value: 'document_type' },
    },
    {
      // Second call: use the classification result to pick the right extraction prompt
      prompt_area: { match_type: 'direct', value: 'extractor' },
      prompt_key: { match_type: 'call_chain', value: 'call[0].document_type' },
      variables: [
        { match_type: 'call_chain', value: 'call[0].raw_text', variable_name: 'previous_analysis' },
      ],
    },
  ],
  continue_on_error: true,
});

console.log(response.merged_result); // Deep-merged results from all calls
```
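
To make the `call_chain` paths concrete, here is one way a path such as `call[0].document_type` could be resolved against earlier results. This is an illustrative sketch only: `resolve_call_chain_path` is a hypothetical function, and the package's own resolution rules (array indexing inside results, error handling) may differ.

```typescript
// Hypothetical resolver; the package's own path handling may differ.
type CallResult = Record<string, unknown>;

// Resolves a path such as "call[0].document_type" against earlier call results.
// Returns undefined if the path is malformed or any segment is missing.
function resolve_call_chain_path(path: string, results: CallResult[]): unknown {
  const match = /^call\[(\d+)\]\.(.+)$/.exec(path);
  if (match === null) return undefined;

  let current: unknown = results[Number(match[1])];
  for (const key of match[2].split('.')) {
    if (typeof current !== 'object' || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}
```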

### 14. Dynamic Data Extraction

Build conditional prompt chains where the next prompt is determined by the current output:

```typescript
import { hazo_llm_dynamic_data_extract } from 'hazo_llm_api/server';

const response = await hazo_llm_dynamic_data_extract({
  initial_prompt_area: 'document',
  initial_prompt_key: 'identify_type',
  initial_prompt_variables: [{ content: document_text }],
  max_depth: 5, // Maximum chain length
  continue_on_error: false,
  context_data: { source: 'email_attachment' },
});

if (response.success) {
  // Access individual step results (unmerged)
  for (const unit of response.unit_results) {
    console.log(`Step ${unit.step}:`, unit.data);
  }

  // Access smart-merged data from all steps
  console.log('Merged data:', response.merged_result);
  console.log(`Completed ${response.successful_steps} steps`);
  console.log('Stop reason:', response.final_stop_reason);
}
```

The chain continues until:
- A prompt has no `next_prompt` configuration
- `max_depth` is reached
- An error occurs (if `continue_on_error` is false)
- The resolved next prompt doesn't exist

## API Reference

### `initialize_llm_api(config: LLMApiConfig): Promise<LLMApiClient>`

Initialize the LLM API. Call this before using any other function.

Configuration is read from the `config/hazo_llm_api_config.ini` file.

#### LLMApiConfig

| Property | Type | Required | Default | Description |
|----------|------|----------|---------|-------------|
| `logger` | Logger | Yes | - | Winston-compatible logger instance |
| `sqlite_path` | string | No | From config file | Path to SQLite database |
| `api_url` | string | No | - | Legacy: API endpoint URL (deprecated, use config file) |
| `api_url_image` | string | No | - | Legacy: Image API endpoint (deprecated, use config file) |
| `api_key` | string | No | - | Legacy: API key (deprecated, use .env.local) |
| `llm_model` | string | No | From config file | Legacy: Provider name (deprecated, use config file) |

**Note:** The config file approach is recommended over passing configuration directly. This keeps sensitive API keys out of your codebase.

---

### `hazo_llm_text_text(params: TextTextParams): Promise<LLMResponse>`

Text input → Text output. Standard text generation.

#### TextTextParams

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt` | string | Yes | The prompt text |
| `prompt_variables` | PromptVariables | No | Variables to substitute |
| `prompt_area` | string | No | Area for dynamic prompt lookup |
| `prompt_key` | string | No | Key for dynamic prompt lookup |

---

### `hazo_llm_image_text(params: ImageTextParams): Promise<LLMResponse>`

Image input → Text output. Analyze an image and get a text description.

#### ImageTextParams

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt` | string | Yes | Instructions for analyzing the image |
| `image_b64` | string | Yes | Base64 encoded image data |
| `image_mime_type` | string | Yes | MIME type (e.g., "image/jpeg") |
| `prompt_variables` | PromptVariables | No | Variables to substitute |

---

### `hazo_llm_text_image(params: TextImageParams): Promise<LLMResponse>`

Text input → Image output. Generate an image from a text description.

#### TextImageParams

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt` | string | Yes | Description of image to generate |
| `prompt_variables` | PromptVariables | No | Variables to substitute |

---

### `hazo_llm_image_image(params: ImageImageParams): Promise<LLMResponse>`

Image(s) input → Image output. Transform, edit, or combine images based on instructions.

Supports both single-image and multi-image input.

#### ImageImageParams

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt` | string | Yes | Transformation/combination instructions |
| `image_b64` | string | For single image | Base64 encoded input image |
| `image_mime_type` | string | For single image | MIME type of input image |
| `images` | Base64Data[] | For multiple images | Array of images to combine |
| `prompt_variables` | PromptVariables | No | Variables to substitute |

**Note:** Use either `image_b64`/`image_mime_type` for single image OR `images` array for multiple images.
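
Application code that accepts either form may want to normalize it before calling the API. The sketch below assumes the `Base64Data` shape (`data`, `mime_type`) used in the Quick Start examples; `ImageInput` and `normalize_image_input` are hypothetical names, and how the package itself treats a call that supplies both forms is not specified here.

```typescript
// Shapes assumed from the parameter table and Quick Start examples.
interface Base64Data {
  data: string;
  mime_type: string;
}

interface ImageInput {
  image_b64?: string;
  image_mime_type?: string;
  images?: Base64Data[];
}

// Hypothetical helper: normalizes either input form into one array,
// and rejects input that supplies both forms or neither.
function normalize_image_input(input: ImageInput): Base64Data[] {
  const has_single = input.image_b64 !== undefined || input.image_mime_type !== undefined;
  const has_multi = input.images !== undefined && input.images.length > 0;

  if (has_single && has_multi) {
    throw new Error('Provide image_b64/image_mime_type or images, not both');
  }
  if (has_multi) return input.images as Base64Data[];
  if (input.image_b64 && input.image_mime_type) {
    return [{ data: input.image_b64, mime_type: input.image_mime_type }];
  }
  throw new Error('Provide image_b64 with image_mime_type, or a non-empty images array');
}
```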

---

### `hazo_llm_text_image_text(params: TextImageTextParams): Promise<LLMResponse>`

Text → Image → Text (Chained). Generate an image from one prompt, then analyze it with a second prompt.

This function chains two operations:
1. Generate an image using `prompt_image`
2. Analyze the generated image using `prompt_text`

Returns both the generated image and the analysis text.

#### TextImageTextParams

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt_image` | string | Yes | Description of image to generate |
| `prompt_text` | string | Yes | Prompt for analyzing the generated image |
| `prompt_image_variables` | PromptVariables | No | Variables for image generation prompt |
| `prompt_text_variables` | PromptVariables | No | Variables for analysis prompt |

---

### `hazo_llm_image_image_text(params: ImageImageTextParams): Promise<LLMResponse>`

Images → Image → Text (Multi-Step Chained). Chain multiple image transformations, then describe the result.

**Flow:**
1. Combine `images[0]` + `images[1]` using `prompts[0]` → result_1
2. Combine result_1 + `images[2]` using `prompts[1]` → result_2
3. Continue for all images
4. Describe final result using `description_prompt` → text output

Returns both the final image and the description text.

#### ImageImageTextParams

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `images` | ChainImage[] | Yes | Array of images to chain (minimum 2) |
| `prompts` | string[] | Yes | Transformation prompts (length = images.length - 1) |
| `description_prompt` | string | Yes | Prompt for describing the final image |
| `description_prompt_variables` | PromptVariables | No | Variables for description prompt |

#### ChainImage

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `image_b64` | string | Yes | Base64 encoded image data |
| `image_mime_type` | string | Yes | MIME type (e.g., "image/jpeg") |

---

### `hazo_llm_document_text(params: DocumentTextParams): Promise<LLMResponse>`

Document/PDF input → Text output. Analyze documents and extract information.

#### DocumentTextParams

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `prompt` | string | Yes | Instructions for analyzing the document |
| `document_b64` | string | Yes | Base64 encoded document data |
| `document_mime_type` | string | Yes | MIME type (e.g., "application/pdf") |
| `prompt_variables` | PromptVariables | No | Variables to substitute |
| `max_pages` | number | No | Maximum pages to process (for large documents) |
| `additional_documents` | Base64Data[] | No | Additional documents for collective analysis in a single LLM call |

---

### `hazo_llm_prompt_chain(params: PromptChainParams): Promise<PromptChainResponse>`

Execute a chain of prompts with dynamic value resolution from previous calls.

#### PromptChainParams

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `chain_calls` | ChainCallDefinition[] | Yes | Array of chain call definitions |
| `continue_on_error` | boolean | No | Continue with the remaining calls after a call fails (default: true) |

#### ChainCallDefinition

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `call_type` | ServiceType | No | Service type (default: 'text_text') |
| `prompt_area` | ChainFieldDefinition | Yes | Prompt area (direct or call_chain) |
| `prompt_key` | ChainFieldDefinition | Yes | Prompt key (direct or call_chain) |
| `variables` | ChainVariableDefinition[] | No | Variables with dynamic resolution |
| `image_b64` | ChainFieldDefinition | No | Image data for image services |
| `image_mime_type` | ChainFieldDefinition | No | Image MIME type |

#### ChainFieldDefinition

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `match_type` | 'direct' \| 'call_chain' | Yes | Resolution method |
| `value` | string | Yes | Value or path (e.g., "call[0].field") |

#### PromptChainResponse

| Property | Type | Description |
|----------|------|-------------|
| `success` | boolean | True if at least one call succeeded |
| `merged_result` | object | Deep-merged results from all calls |
| `call_results` | ChainCallResult[] | Individual call results |
| `errors` | array | Errors encountered |
| `total_calls` | number | Total calls attempted |
| `successful_calls` | number | Number of successful calls |

---

### `hazo_llm_dynamic_data_extract(params: DynamicDataExtractParams): Promise<DynamicDataExtractResponse>`

Execute a dynamic chain where each prompt's `next_prompt` configuration determines the next step.

#### DynamicDataExtractParams

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `initial_prompt_area` | string | Yes | Starting prompt area |
| `initial_prompt_key` | string | Yes | Starting prompt key |
| `initial_prompt_variables` | PromptVariables | No | Variables for initial prompt |
| `image_b64` | string | No | Image for image_text first call |
| `image_mime_type` | string | No | Image MIME type |
| `max_depth` | number | No | Maximum chain depth (default: 10) |
| `continue_on_error` | boolean | No | Continue on error (default: false) |
| `context_data` | object | No | Additional context for variable substitution |

#### DynamicDataExtractResponse

| Property | Type | Description |
|----------|------|-------------|
| `success` | boolean | True if chain completed successfully |
| `merged_result` | object | Smart-merged results from all steps (flattens arrays, merges like fields) |
| `unit_results` | array | Individual parsed results from each step: `[{ step: "<area>_<key>", data: ... }]` |
| `step_results` | DynamicExtractStepResult[] | Individual step results with metadata |
| `errors` | array | Errors encountered |
| `total_steps` | number | Total steps executed |
| `successful_steps` | number | Number of successful steps |
| `final_stop_reason` | string | Why the chain stopped |

**Note on `unit_results` vs `merged_result`:**
- `unit_results`: Preserves each step's raw parsed output for granular access. Use when you need to access individual step data without merging.
- `merged_result`: Smart-merged data from all steps. Arrays of objects are flattened and their fields merged. Useful when steps produce complementary data (e.g., multiple extraction steps building a unified record).

**Stop Reasons:**
- `no_next_prompt` - Prompt has no next_prompt configured
- `max_depth` - Reached max_depth limit
- `error` - An error occurred
- `next_prompt_not_found` - Resolved next prompt doesn't exist
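
The relationship between the stop reasons, `max_depth`, and `continue_on_error` can be sketched as a driver loop. This is a simplified model, not the package's implementation: `run_dynamic_chain`, `PromptRef`, and `StepOutcome` are hypothetical, and `run_step` stands in for the prompt lookup plus LLM call.

```typescript
type StopReason = 'no_next_prompt' | 'max_depth' | 'error' | 'next_prompt_not_found';

interface PromptRef {
  area: string;
  key: string;
}

// Result of one step. `next` is the follow-up prompt, if the step names one.
interface StepOutcome {
  ok: boolean;
  next?: PromptRef;
}

// Hypothetical driver loop. `run_step` returns null when the referenced
// prompt does not exist.
async function run_dynamic_chain(
  start: PromptRef,
  run_step: (ref: PromptRef) => Promise<StepOutcome | null>,
  max_depth = 10,
  continue_on_error = false,
): Promise<{ steps: number; stop_reason: StopReason }> {
  let current: PromptRef = start;
  let steps = 0;
  while (steps < max_depth) {
    const outcome = await run_step(current);
    if (outcome === null) return { steps, stop_reason: 'next_prompt_not_found' };
    steps += 1;
    if (!outcome.ok && !continue_on_error) return { steps, stop_reason: 'error' };
    if (outcome.next === undefined) return { steps, stop_reason: 'no_next_prompt' };
    current = outcome.next;
  }
  return { steps, stop_reason: 'max_depth' };
}
```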

---

### LLMResponse

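The single-call functions (`hazo_llm_text_text`, `hazo_llm_image_text`, and so on) return an `LLMResponse`; `hazo_llm_prompt_chain` and `hazo_llm_dynamic_data_extract` return their own response types, documented above: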

```typescript
interface LLMResponse {
  success: boolean;           // Whether the call succeeded
  text?: string;              // Generated text response
  image_b64?: string;         // Generated image (base64)
  image_mime_type?: string;   // MIME type of generated image
  error?: string;             // Error message if failed
  error_info?: LLMError;      // Structured error (code, message, retryable)
  error_code?: LLMErrorCode;  // Shorthand alias for error_info.code
  raw_response?: unknown;     // Raw API response
  usage?: UsageInfo;          // Tokens, cost, latency, model, finish_reason, attempts
}
```

`UsageInfo` includes:
```typescript
interface UsageInfo {
  input_tokens?: number;
  output_tokens?: number;
  cost_usd?: number;
  latency_ms?: number;
  finish_reason?: FinishReason;
  model?: string;
  provider?: string;
  attempts?: UsageAttempt[];  // Failed cascade attempts before success
}
```
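
Because `usage` is attached to each response, totals for a batch of calls can be computed in application code. The sketch below sums tokens and cost; `sum_usage`, `UsageLike`, and `UsageTotals` are hypothetical names, and missing fields are treated as zero, which is an assumption rather than documented behavior.

```typescript
// Subset of the UsageInfo fields shown above.
interface UsageLike {
  input_tokens?: number;
  output_tokens?: number;
  cost_usd?: number;
}

interface UsageTotals {
  input_tokens: number;
  output_tokens: number;
  cost_usd: number;
}

// Hypothetical helper: sums usage over responses, treating missing fields as 0.
function sum_usage(responses: Array<{ usage?: UsageLike }>): UsageTotals {
  const totals: UsageTotals = { input_tokens: 0, output_tokens: 0, cost_usd: 0 };
  for (const { usage } of responses) {
    if (usage === undefined) continue;
    totals.input_tokens += usage.input_tokens ?? 0;
    totals.output_tokens += usage.output_tokens ?? 0;
    totals.cost_usd += usage.cost_usd ?? 0;
  }
  return totals;
}
```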

## Using Specific LLM Providers

By default, all functions use the `primary_llm` configured in your config file. You can override this per-call:

```typescript
// Use Gemini explicitly
const response1 = await hazo_llm_text_text(
  { prompt: 'Hello world' },
  'gemini'
);

// Use Qwen explicitly
const response2 = await hazo_llm_text_text(
  { prompt: 'Hello world' },
  'qwen'
);

// Use primary LLM (from config)
const response3 = await hazo_llm_text_text(
  { prompt: 'Hello world' }
);
```

### Provider Configuration

Each provider has its own section in `config/hazo_llm_api_config.ini`:

```ini
[llm_gemini]
api_url=https://generativelanguage.googleapis.com/v1/models/gemini-2.5-flash:generateContent
api_url_image=https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent
capabilities=["text_text", "image_text", "text_image", "image_image"]
text_temperature=0.7
image_temperature=0.1

[llm_qwen]
api_url=https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions
model_text_text=qwen-max
model_image_text=qwen-vl-max
capabilities=["text_text", "image_text"]
text_temperature=0.8
```

### Supported Providers

| Provider | Capabilities | Config Section | Env Var |
|----------|-------------|----------------|---------|
| Gemini | text_text, image_text, text_image, image_image, document_text | `[llm_gemini]` | `GEMINI_API_KEY` |
| Qwen | text_text, image_text, text_image, image_image | `[llm_qwen]` | `QWEN_API_KEY` |
| Anthropic | text_text, image_text, document_text, text_text_stream | `[llm_anthropic]` | `ANTHROPIC_API_KEY` |
| OpenAI | text_text, image_text, text_image, text_text_stream, embed | `[llm_openai]` | `OPENAI_API_KEY` |
| DeepSeek | text_text, text_text_stream | `[llm_deepseek]` | `DEEPSEEK_API_KEY` |
| Custom | Define your own | — | Implement LLMProvider interface |

**Provider config example** — add each enabled provider to your `config/hazo_llm_api_config.ini`:

```ini
[llm]
enabled_llms=["gemini", "anthropic", "openai", "deepseek"]
primary_llm=anthropic

[llm_anthropic]
api_url=https://api.anthropic.com/v1/messages
api_version=2023-06-01
model_text_text=claude-sonnet-4-6
model_image_text=claude-sonnet-4-6
model_document_text=claude-sonnet-4-6
text_max_tokens=8192

[llm_openai]
api_url=https://api.openai.com/v1/chat/completions
api_url_image=https://api.openai.com/v1/images/generations
api_url_embed=https://api.openai.com/v1/embeddings
model_text_text=gpt-4o
model_image_text=gpt-4o
model_text_image=gpt-image-1
model_embed=text-embedding-3-small

[llm_deepseek]
api_url=https://api.deepseek.com/v1/chat/completions
model_text_text=deepseek-chat
```

See `TECHDOC.md` for instructions on adding custom providers.

## Prompt Management

### Database Schema

Prompts are stored in a SQLite database with the following table:

**Table: `prompts_library`**

| Column | Type | Description |
|--------|------|-------------|
| `uuid` | TEXT | Unique identifier |
| `prompt_area` | TEXT | Category/area for the prompt |
| `prompt_key` | TEXT | Unique key within the area |
| `prompt_name` | TEXT | Human-readable title/name for the prompt |
| `prompt_text_head` | TEXT | Optional prefix text prepended to the main prompt |
| `prompt_text_body` | TEXT | Main body of the prompt (required) |
| `prompt_text_tail` | TEXT | Optional suffix text appended to the main prompt |
| `prompt_variables` | TEXT | JSON array of expected variables |
| `prompt_notes` | TEXT | Documentation/notes |
| `created_at` | TEXT | Creation timestamp |
| `changed_at` | TEXT | Last update timestamp |

**Note:** The `prompt_text_full` field is a computed field that concatenates `prompt_text_head`, `prompt_text_body`, and `prompt_text_tail` with paragraph breaks (`\n\n`). It is not stored in the database but is available in `PromptRecord` objects.

### Variable Substitution

Placeholders in prompt text that start with `$` (for example `$topic`) are replaced with the matching values from `prompt_variables`. Substitution applies to all prompt text fields (head, body, and tail):

```
Prompt Body: "Write about $topic in $style style."
Variables: [{ topic: "AI", style: "academic" }]
Result: "Write about AI in academic style."
```

**Example with Multiple Fields:**
```
Prompt Head: "Context: $context"
Prompt Body: "Write about $topic"
Prompt Tail: "Format: $format"
Variables: [{ context: "Academic", topic: "AI", format: "Essay" }]
Result (Full): "Context: Academic\n\nWrite about AI\n\nFormat: Essay"
```
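
The two examples above can be reproduced with a short sketch. It is not the package's implementation: `substitute_variables` and `build_full_prompt` are hypothetical functions, and leaving unknown `$names` untouched and skipping empty head or tail parts are assumptions.

```typescript
type PromptVariables = Array<Record<string, string>>;

// Hypothetical sketch of `$name` substitution; the package's rules may differ
// (for example, in how unknown variables or escaping are handled).
function substitute_variables(text: string, variables: PromptVariables): string {
  // Later entries override earlier ones when the same name appears twice.
  const values: Record<string, string> = Object.assign({}, ...variables);
  return text.replace(/\$([A-Za-z_][A-Za-z0-9_]*)/g, (token: string, name: string) =>
    Object.prototype.hasOwnProperty.call(values, name) ? values[name] : token,
  );
}

// Joins the non-empty head, body, and tail with paragraph breaks.
function build_full_prompt(
  parts: { head?: string; body: string; tail?: string },
  variables: PromptVariables,
): string {
  return [parts.head, parts.body, parts.tail]
    .filter((part): part is string => typeof part === 'string' && part.length > 0)
    .map((part) => substitute_variables(part, variables))
    .join('\n\n');
}
```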

### Dynamic Prompts

Use `prompt_area` and `prompt_key` to fetch prompts from the database:

```typescript
const response = await hazo_llm_text_text({
  prompt: '', // Will be overridden by dynamic prompt
  prompt_area: 'marketing',
  prompt_key: 'product_description',
  prompt_variables: [{ product_name: 'Widget Pro' }],
});
```

### Prompt Import/Export Format

The test application supports bulk import/export of prompts via JSON files. This enables backup, migration, and sharing of prompt libraries.

#### Export Format

When exporting prompts, the JSON file follows this structure:

```json
{
  "version": "1.0",
  "exported_at": "2024-01-15T10:30:00.000Z",
  "prompts": [
    {
      "prompt_area": "marketing",
      "prompt_key": "greeting",
      "local_1": null,
      "local_2": null,
      "local_3": null,
      "user_id": null,
      "scope_id": null,
      "prompt_name": "Marketing Greeting Email",
      "prompt_text_head": "Dear valued customer,",
      "prompt_text_body": "Hello {{name}}, welcome to {{service}}.",
      "prompt_text_tail": "Best regards,\nThe Team",
      "prompt_variables": [
        { "name": "name", "description": "Customer name" },
        { "name": "service", "description": "Service name" }
      ],
      "prompt_notes": "Standard greeting for marketing emails"
    }
  ]
}
```

#### Required Fields for Import

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `prompt_area` | string | Yes | Category/area for the prompt |
| `prompt_key` | string | Yes | Unique key within the area |
| `prompt_name` | string | Yes | Human-readable title/name for the prompt |
| `prompt_text_body` | string | Yes | Main body of the prompt template |
| `prompt_text_head` | string | No | Optional prefix text prepended to the prompt |
| `prompt_text_tail` | string | No | Optional suffix text appended to the prompt |
| `local_1` | string \| null | No | Local filter 1 (e.g., region) |
| `local_2` | string \| null | No | Local filter 2 (e.g., department) |
| `local_3` | string \| null | No | Local filter 3 (e.g., sub-category) |
| `user_id` | string \| null | No | User-specific identifier |
| `scope_id` | string \| null | No | Scope-specific identifier |
| `prompt_variables` | array | No | Array of `{ name, description }` objects |
| `prompt_notes` | string | No | Documentation/notes for the prompt |

#### Import Behavior

- Each imported prompt receives a new UUID automatically
- `created_at` is set to the import timestamp
- `changed_at` is set to the import timestamp
- Duplicate `prompt_area`/`prompt_key` combinations create new entries (no automatic deduplication)
- Invalid prompts (missing required fields) are skipped with errors reported
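
A file can be checked before upload against the required fields listed above. The sketch below is a hypothetical client-side pre-flight check, not the importer itself; `check_import_records` and its error message format are illustrative, and the server's validation may be stricter.

```typescript
// Required fields, per the "Required Fields for Import" table.
const REQUIRED_IMPORT_FIELDS = [
  'prompt_area',
  'prompt_key',
  'prompt_name',
  'prompt_text_body',
] as const;

interface ImportCheck {
  valid: Array<Record<string, unknown>>;
  errors: string[];
}

// Hypothetical pre-flight check: splits records into valid ones and error messages.
function check_import_records(prompts: Array<Record<string, unknown>>): ImportCheck {
  const result: ImportCheck = { valid: [], errors: [] };
  prompts.forEach((record, index) => {
    // A required field counts as missing if it is absent, non-string, or blank.
    const missing = REQUIRED_IMPORT_FIELDS.filter((field) => {
      const value = record[field];
      return typeof value !== 'string' || value.trim() === '';
    });
    if (missing.length === 0) {
      result.valid.push(record);
    } else {
      result.errors.push(`prompts[${index}]: missing ${missing.join(', ')}`);
    }
  });
  return result;
}
```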

#### Bulk Operations UI

The test application's Prompt Configuration page (`/prompt-config`) provides a user interface for bulk operations:

**Selection:**
- Individual row selection via checkbox
- Select all/deselect all checkbox in table header
- Visual indication of selected rows

**Export:**
1. Select one or more prompts using checkboxes
2. Click "Export" button (Download icon)
3. JSON file downloads automatically with filename: `prompts_export_YYYY-MM-DD.json`

**Import:**
1. Click "Import" button (Upload icon)
2. Select a JSON file matching the export format
3. Prompts are validated and imported automatically
4. Success/error messages display import results

**Delete:**
1. Select one or more prompts using checkboxes
2. Click "Delete Selected" button
3. Confirm deletion in dialog
4. Selected prompts are removed permanently

#### API Endpoints

The test application provides bulk operation endpoints:

| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/api/prompts/bulk` | Import prompts from JSON |
| `DELETE` | `/api/prompts/bulk` | Delete multiple prompts by ID |

**Import Request:**
```json
{
  "prompts": [
    {
      "prompt_area": "...",
      "prompt_key": "...",
      "prompt_name": "...",
      "prompt_text_body": "...",
      "prompt_text_head": "...",
      "prompt_text_tail": "..."
    }
  ]
}
```

**Import Response:**
```json
{
  "success": true,
  "imported_count": 5,
  "errors": ["Optional array of error messages for failed imports"]
}
```

**Delete Request:**
```json
{
  "ids": ["uuid-1", "uuid-2", "uuid-3"]
}
```

**Delete Response:**
```json
{
  "success": true,
  "deleted_count": 3,
  "errors": ["Optional array of error messages for failed deletions"]
}
```
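
The request shapes above can be assembled with two small builders. This is a sketch only: the helper names and the `Content-Type` header are assumptions, while the endpoint, methods, and body shapes come from the table and JSON examples above.

```typescript
// Endpoint taken from the endpoint table above.
const BULK_ENDPOINT = '/api/prompts/bulk';

interface BulkRequest {
  url: string;
  init: {
    method: 'POST' | 'DELETE';
    headers: Record<string, string>;
    body: string;
  };
}

// Builds the import request: POST with a { prompts: [...] } body.
function build_bulk_import_request(prompts: Array<Record<string, unknown>>): BulkRequest {
  return {
    url: BULK_ENDPOINT,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' }, // assumed header
      body: JSON.stringify({ prompts }),
    },
  };
}

// Builds the delete request: DELETE with an { ids: [...] } body.
function build_bulk_delete_request(ids: string[]): BulkRequest {
  return {
    url: BULK_ENDPOINT,
    init: {
      method: 'DELETE',
      headers: { 'Content-Type': 'application/json' }, // assumed header
      body: JSON.stringify({ ids }),
    },
  };
}
```

The result can be passed straight to `fetch`, for example `const { url, init } = build_bulk_delete_request(['uuid-1']); const res = await fetch(url, init);`.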

## PromptEditor Component

The package includes a reusable React component for managing prompts with full CRUD operations, bulk import/export, and customizable UI.

### Basic Usage

```typescript
import { Layout, PromptEditor, create_rest_api_connect } from 'hazo_llm_api';

// Create a REST API connect instance pointing to your API endpoints
const connect = create_rest_api_connect('/api/prompts');

function PromptConfigPage() {
  return (
    <Layout sidebar={<YourSidebar />}>
      <PromptEditor connect={connect} />
    </Layout>
  );
}
```

### With Customization

```typescript
<PromptEditor
  connect={connect}
  customization={{
    title: 'My Custom Prompts',
    description: 'Manage prompts for my application',
    columns: {
      ownership: false,  // Hide user_id/scope_id column
      locals: false,     // Hide local filters column
    },
    features: {
      bulk_delete: false,  // Disable bulk delete
    },
    empty_message: 'No prompts yet. Create your first one!',
  }}
  callbacks={{
    on_create: (prompt) => console.log('Created:', prompt),
    on_update: (prompt) => console.log('Updated:', prompt),
    on_delete: (id) => console.log('Deleted:', id),
    on_error: (error) => console.error('Error:', error),
  }}
/>
```

### PromptEditor Features

- **CRUD Operations**: Create, read, update, delete prompts
- **Duplicate Prompts**: Clone existing prompts with auto-generated unique keys
- **Bulk Selection**: Select multiple prompts with checkboxes
- **Bulk Export**: Download selected prompts as JSON
- **Bulk Import**: Upload JSON file to import prompts
- **Bulk Delete**: Delete multiple selected prompts
- **Variable Management**: Add/edit/remove prompt variables
- **Next Prompt Configuration**: Set up dynamic prompt chaining
- **Customizable UI**: Control visible columns and enabled features
- **Callbacks**: React to create, update, delete, and error events

### Bundled Dependencies

The PromptEditor bundles its own UI components (based on shadcn/ui) so consuming apps only need:
- **Tailwind CSS** configured in the project
- **React 18+** as a peer dependency

### Tailwind v4 Setup (Required)

The PromptEditor component uses Tailwind CSS utility classes. If you're using **Tailwind v4**, you must add the following to your `globals.css` or main CSS file:

```css
@import "tailwindcss";

/* REQUIRED: Enable Tailwind to scan hazo_llm_api classes (path is relative to this CSS file) */
@source "../node_modules/hazo_llm_api/dist";
```

Without this directive, the PromptEditor styling will not work correctly. Symptoms include:
- Invisible or missing hover states
- Missing background colors and text colors
- Broken layout with flex/grid utilities
- Dialog/modal styling issues

**For Tailwind v3 users**, add hazo_llm_api to your content paths in `tailwind.config.ts`:

```typescript
export default {
  content: [
    // ... your other content paths
    "./node_modules/hazo_llm_api/dist/**/*.{js,ts,jsx,tsx}",
  ],
}
```

## hazo_connect Abstraction

The `hazo_connect` interface provides a unified way to access prompt data, whether from a REST API (client-side) or directly from the database (server-side).

### Client-Side: REST API Connect

```typescript
import { create_rest_api_connect, type HazoConnect } from 'hazo_llm_api';

// Create a connect instance for your API endpoints
const connect = create_rest_api_connect('/api/prompts');

// Read operations (each resolves to { success, data })
const all_prompts = await connect.get_all();
const by_id = await connect.get_by_id('uuid-here');
const by_area = await connect.get_by_area('marketing');
const by_area_key = await connect.get_by_area_key('marketing', 'greeting');

// Write operations (v1.2.0+ field names: prompt_name and prompt_text_body)
const created = await connect.create({
  prompt_area: 'marketing',
  prompt_key: 'greeting',
  prompt_name: 'Marketing Greeting',
  prompt_text_body: 'Hello {{name}}!',
});

const updated = await connect.update('uuid-here', {
  prompt_text_body: 'Updated text',
});

const deleted = await connect.delete('uuid-here');

// Bulk operations (each resolves to { success, count })
const imported = await connect.bulk_import([...prompts]);
const removed = await connect.bulk_delete(['uuid1', 'uuid2']);
```

### Server-Side: Direct Database Connect

```typescript
import { create_direct_db_connect, get_database, default_logger } from 'hazo_llm_api/server';

// Create a connect instance with direct DB access
const connect = create_direct_db_connect(() => get_database(), default_logger);

// Same interface as REST API connect
const { success, data } = await connect.get_by_area_key('marketing', 'greeting');
```

### HazoConnect Interface

```typescript
interface HazoConnect {
  // Read
  get_all(): Promise<HazoConnectResponse<PromptRecord[]>>;
  get_by_id(id: string): Promise<HazoConnectResponse<PromptRecord>>;
  get_by_area(area: string): Promise<HazoConnectResponse<PromptRecord[]>>;
  get_by_area_key(area: string, key: string, locals?: LocalFilters): Promise<HazoConnectResponse<PromptRecord>>;

  // Write
  create(prompt: PromptInput): Promise<HazoConnectResponse<PromptRecord>>;
  update(id: string, updates: Partial<PromptInput>): Promise<HazoConnectResponse<PromptRecord>>;
  delete(id: string): Promise<HazoConnectResponse<void>>;

  // Bulk
  bulk_import(prompts: PromptInput[]): Promise<HazoConnectBulkResponse>;
  bulk_delete(ids: string[]): Promise<HazoConnectBulkResponse>;
}
```

### Custom Backend Implementation

You can implement the `HazoConnect` interface for custom backends:

```typescript
import type { HazoConnect } from 'hazo_llm_api';

const custom_connect: HazoConnect = {
  async get_all() {
    // Your custom implementation
    const data = await myCustomDb.getAllPrompts();
    return { success: true, data };
  },
  // ... implement other methods
};

// Use with PromptEditor
<PromptEditor connect={custom_connect} />
```

### External Backend for LLM Functions

You can also use a custom `HazoConnect` implementation for all LLM functions by passing it during initialization. This allows you to use external backends like PostgREST, Supabase, or any custom REST API instead of the default SQLite database:

```typescript
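import { initialize_llm_api, hazo_llm_text_text, type HazoConnect } from 'hazo_llm_api/server';

// Placeholder: base URL of your PostgREST server
const POSTGREST_URL = 'http://localhost:3000';

// Create your custom HazoConnect implementation
const postgrestConnect: HazoConnect = {
  async get_by_area_key(area, key, locals) {
    // Encode the values so special characters cannot break the query string
    const query = `prompt_area=eq.${encodeURIComponent(area)}&prompt_key=eq.${encodeURIComponent(key)}`;
    const response = await fetch(`${POSTGREST_URL}/hazo_prompts?${query}`);
    const data = await response.json();
    return { success: true, data: data[0] };
  },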
  // ... implement other methods
};

// Initialize with your custom backend
await initialize_llm_api({
  logger: myLogger,
  hazo_connect: postgrestConnect,  // Skips SQLite initialization
});

// All LLM functions now use your custom backend for prompt retrieval
const response = await hazo_llm_text_text({
  prompt: '',
  prompt_area: 'marketing',
  prompt_key: 'greeting',
});
```

When `hazo_connect` is provided:
- SQLite database initialization is **skipped**
- The provided `HazoConnect` instance is used for all prompt operations
- All LLM functions work seamlessly with the external backend

## Server-Side Only

**Important**: All LLM API functions must be used server-side only.

```typescript
// ✅ Correct - Server-side import
import { hazo_llm_text_text } from 'hazo_llm_api/server';

// ❌ Wrong - Will fail on client-side
import { hazo_llm_text_text } from 'hazo_llm_api';
```

Use in:
- Next.js API routes
- Next.js Server Components
- Next.js Server Actions
- Node.js backend services

### Prompt Cache

Prompts are cached in an LRU cache (default: 100 entries, 5-minute TTL). The cache is automatically invalidated on all database writes (insert, update, delete, bulk operations).

For manual control:

```typescript
import {
  invalidate_prompt_cache,
  invalidate_prompt_cache_by_id,
  clear_prompt_cache,
  configure_prompt_cache,
} from 'hazo_llm_api/server';

// Invalidate a specific cached prompt
invalidate_prompt_cache('marketing', 'greeting');
invalidate_prompt_cache_by_id('some-uuid');

// Clear all cached prompts
clear_prompt_cache();

// Configure cache settings
configure_prompt_cache({ ttl_ms: 60000, max_size: 50 });
```
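
To make the caching behavior described above more concrete, here is a minimal sketch of an LRU cache with a TTL, using the documented defaults (100 entries, 5 minutes). It illustrates the general technique only and is not the package's internal cache; the class name `LruTtlCache` and the explicit `now` parameter (included to keep the example deterministic) are hypothetical.

```typescript
class LruTtlCache<V> {
  private readonly entries = new Map<string, { value: V; expires_at: number }>();
  private readonly max_size: number;
  private readonly ttl_ms: number;

  constructor(max_size = 100, ttl_ms = 5 * 60 * 1000) {
    this.max_size = max_size;
    this.ttl_ms = ttl_ms;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expires_at <= now) {
      // Expired: drop the entry and report a miss.
      this.entries.delete(key);
      return undefined;
    }
    // Re-insert to mark as most recently used (Map preserves insertion order).
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expires_at: now + this.ttl_ms });
    if (this.entries.size > this.max_size) {
      // Evict the least recently used entry, which is the first key in the Map.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  invalidate(key: string): void {
    this.entries.delete(key);
  }

  clear(): void {
    this.entries.clear();
  }
}
```

A lower `ttl_ms` trades more database reads for fresher prompts; a lower `max_size` trades memory for more evictions.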

## Examples

### Chat/Q&A

```typescript
const response = await hazo_llm_text_text({
  prompt: `Answer this question: ${user_question}`,
});
```

### Document Summarization

```typescript
const response = await hazo_llm_text_text({
  prompt: `Summarize this document in 3 bullet points:\n\n${document_text}`,
});
```

### Image OCR

```typescript
const response = await hazo_llm_image_text({
  prompt: 'Extract all text from this image.',
  image_b64: document_image_base64,
  image_mime_type: 'image/png',
});
```

### Product Image Analysis

```typescript
const response = await hazo_llm_image_text({
  prompt: 'Describe this product and list its key features.',
  image_b64: product_image_base64,
  image_mime_type: 'image/jpeg',
});
```

## Error Handling

```typescript
const response = await hazo_llm_text_text({
  prompt: 'Hello world',
});

if (!response.success) {
  logger.error('LLM call failed', { error: response.error });
  // Handle error appropriately
}
```

## Configuration

### Logging

`hazo_llm_api` requires you to provide a logger instance that implements the `Logger` interface:

```typescript
interface Logger {
  error: (message: string, meta?: object) => void;
  info: (message: string, meta?: object) => void;
  warn: (message: string, meta?: object) => void;
  debug: (message: string, meta?: object) => void;
}
```

#### Option A: Use hazo_logs Package (Recommended)

Install the `hazo_logs` package for file-based logging with daily rotation:

```bash
npm install hazo_logs
```

```typescript
import { createLogger } from 'hazo_logs';
import { initialize_llm_api } from 'hazo_llm_api/server';

// Create logger instance
const hazoLogger = createLogger('my_app');

// Wrap to match Logger interface
const logger = {
  error: (msg: string, meta?: object) => hazoLogger.error(msg, meta),
  info: (msg: string, meta?: object) => hazoLogger.info(msg, meta),
  warn: (msg: string, meta?: object) => hazoLogger.warn(msg, meta),
  debug: (msg: string, meta?: object) => hazoLogger.debug(msg, meta),
};

// Pass to initialize_llm_api
await initialize_llm_api({ logger });
```

#### Console Output

The package produces minimal, focused logging. Initialization is silent. Each LLM API call logs:

1. **Input**: prompt_area/key and variables being injected
2. **Prompt**: before and after variable substitution (clearly separated)
3. **Result**: the LLM response text or error

```
[HAZO_LLM] ═══ text_text ═══
  Input: area="clarification" key="validate_response"
  Variables: {"location":"Tokyo"}

  ── Prompt (before substitution) ──
  Please validate {{location}}...

  ── Prompt (after substitution) ──
  Please validate Tokyo...

  ── Result ──
  The response is valid...
[HAZO_LLM] ═══ /text_text ═══
```
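
The before/after pair in the log output corresponds to a substitution step like the one sketched below. This is an illustration of the idea using the `{{name}}` placeholder style shown in the log; the function name `substitute_variables` and the choice to leave unknown placeholders untouched are assumptions, not the package's documented behavior.

```typescript
// Replaces {{name}} placeholders with values from `variables`.
// Placeholders without a matching variable are left as they are (assumption).
function substitute_variables(template: string, variables: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (placeholder: string, name: string): string => {
    const value = Object.prototype.hasOwnProperty.call(variables, name)
      ? variables[name]
      : undefined;
    return value ?? placeholder;
  });
}
```

For example, `substitute_variables('Please validate {{location}}...', { location: 'Tokyo' })` yields `'Please validate Tokyo...'`.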

#### Option B: Use Console Logger

For simple use cases, use a console logger:

```typescript
const logger = {
  error: console.error,
  info: console.log,
  warn: console.warn,
  debug: console.debug,
};

await initialize_llm_api({ logger });
```

#### Option C: Use Any Compatible Logger

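Any logger that implements the `Logger` interface works. Winston's `(message, meta)` call style matches it directly; loggers with a different argument order, such as Pino (which takes the metadata object first), need a thin wrapper.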
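
Some loggers take the metadata object before the message (Pino's `logger.info(obj, msg)` is one example), so they need a small adapter to satisfy the `(message, meta)` signature. The sketch below shows one way to write it; `ObjectFirstLogger` and `adapt_object_first_logger` are hypothetical names, and the `Logger` interface is repeated locally only to keep the example self-contained.

```typescript
// Logger interface as documented above.
interface Logger {
  error: (message: string, meta?: object) => void;
  info: (message: string, meta?: object) => void;
  warn: (message: string, meta?: object) => void;
  debug: (message: string, meta?: object) => void;
}

// Hypothetical shape of an object-first logger such as Pino.
interface ObjectFirstLogger {
  error: (obj: object, msg?: string) => void;
  info: (obj: object, msg?: string) => void;
  warn: (obj: object, msg?: string) => void;
  debug: (obj: object, msg?: string) => void;
}

// Swaps the argument order so the target can be passed as `logger`.
function adapt_object_first_logger(target: ObjectFirstLogger): Logger {
  return {
    error: (message, meta) => target.error(meta ?? {}, message),
    info: (message, meta) => target.info(meta ?? {}, message),
    warn: (message, meta) => target.warn(meta ?? {}, message),
    debug: (message, meta) => target.debug(meta ?? {}, message),
  };
}
```

The adapted object can then be passed as `logger` to `initialize_llm_api`.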

#### HazoLoggerConfig Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `log_file` | string | Required | Full path to log file |
| `log_level` | LogLevel | 'info' | Minimum log level (debug, info, warn, error) |
| `max_size` | string | '10m' | Max file size before rotation |
| `max_files` | number | 5 | Max rotated files to keep |
| `console_enabled` | boolean | true | Also log to console |

### Logger Interface

The logger must implement these methods (Winston-compatible):

```typescript
interface Logger {
  error: (message: string, meta?: Record<string, unknown>) => void;
  info: (message: string, meta?: Record<string, unknown>) => void;
  warn: (message: string, meta?: Record<string, unknown>) => void;
  debug: (message: string, meta?: Record<string, unknown>) => void;
}
```

## License

MIT

## Author

Pubs Abayasiri

## Migration Guide

### Upgrading to v1.2.0 - Prompt Text Field Changes

Version 1.2.0 introduces a breaking change to the prompt text structure. Follow these steps to migrate:

#### What Changed

**Before (v1.1.x):**
```typescript
{
  prompt_area: 'marketing',
  prompt_key: 'greeting',
  prompt_text: 'Dear Customer,\n\nHello {{name}}!\n\nBest regards',
}
```

**After (v1.2.0):**
```typescript
{
  prompt_area: 'marketing',
  prompt_key: 'greeting',
  prompt_name: 'Marketing Greeting',         // NEW: Required
  prompt_text_head: 'Dear Customer,',        // NEW: Optional prefix
  prompt_text_body: 'Hello {{name}}!',       // NEW: Main content (required)
  prompt_text_tail: 'Best regards',          // NEW: Optional suffix
  prompt_text_full: '...',                   // NEW: Computed field (read-only)
}
```

#### Automatic Database Migration

The database schema is automatically migrated when you upgrade:
- A new `prompt_name` column is added (defaults to concatenation of area + key)
- Existing `prompt_text` data is moved to `prompt_text_body`
- New `prompt_text_head` and `prompt_text_tail` columns are added (empty by default)
- Old `prompt_text` column is removed

#### Code Changes Required

1. **Update Prompt Creation:**
```typescript
// OLD
await insert_prompt({
  prompt_area: 'marketing',
  prompt_key: 'greeting',
  prompt_text: 'Hello {{name}}!',
});

// NEW
await insert_prompt({
  prompt_area: 'marketing',
  prompt_key: 'greeting',
  prompt_name: 'Marketing Greeting',
  prompt_text_body: 'Hello {{name}}!',
  prompt_text_head: 'Dear Customer,',  // Optional
  prompt_text_tail: 'Best regards',     // Optional
});
```

2. **Update Prompt Updates:**
```typescript
// OLD
await update_prompt('marketing', 'greeting', {
  prompt_text: 'Updated text',
});

// NEW
await update_prompt('marketing', 'greeting', {
  prompt_text_body: 'Updated text',    // Or update head/tail separately
});
```

3. **Update JSON Import/Export:**
Ensure your JSON files include the new fields:
```json
{
  "version": "1.0",
  "prompts": [
    {
      "prompt_area": "marketing",
      "prompt_key": "greeting",
      "prompt_name": "Marketing Greeting",
      "prompt_text_head": "Dear Customer,",
      "prompt_text_body": "Hello {{name}}!",
      "prompt_text_tail": "Best regards"
    }
  ]
}
```
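
For JSON files exported before v1.2.0, a conversion that mirrors the automatic database migration rules could look like the sketch below. The function and type names are hypothetical, and the separator used when deriving `prompt_name` from area and key is an assumption (the migration notes only say "concatenation of area + key").

```typescript
// Shape of a prompt entry before v1.2.0.
interface LegacyPrompt {
  prompt_area: string;
  prompt_key: string;
  prompt_text: string;
}

// Shape of a prompt entry from v1.2.0 onward (text fields only).
interface MigratedPrompt {
  prompt_area: string;
  prompt_key: string;
  prompt_name: string;
  prompt_text_head: string;
  prompt_text_body: string;
  prompt_text_tail: string;
}

function migrate_legacy_prompt(legacy: LegacyPrompt): MigratedPrompt {
  return {
    prompt_area: legacy.prompt_area,
    prompt_key: legacy.prompt_key,
    // Assumption: area and key joined with a space.
    prompt_name: `${legacy.prompt_area} ${legacy.prompt_key}`,
    prompt_text_head: '',                  // empty by default
    prompt_text_body: legacy.prompt_text,  // old text moves to the body
    prompt_text_tail: '',                  // empty by default
  };
}
```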

#### Using prompt_text_full

The computed `prompt_text_full` field is available in all `PromptRecord` objects:

```typescript
const prompt = await get_prompt_by_area_and_key(db, 'marketing', 'greeting', logger);

// Access individual parts
console.log(prompt.prompt_text_head);  // "Dear Customer,"
console.log(prompt.prompt_text_body);  // "Hello {{name}}!"
console.log(prompt.prompt_text_tail);  // "Best regards"

// Access full concatenated text
console.log(prompt.prompt_text_full);  // "Dear Customer,\n\nHello {{name}}!\n\nBest regards"
```
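
Judging from the example output above, the full text is the head, body, and tail joined by a blank line. A minimal sketch of that composition follows; the function name `build_prompt_text_full` is hypothetical, and skipping empty parts is an assumption rather than documented behavior.

```typescript
interface PromptTextParts {
  prompt_text_head?: string | null;
  prompt_text_body: string;
  prompt_text_tail?: string | null;
}

// Joins the non-empty parts with a blank line between them.
function build_prompt_text_full(parts: PromptTextParts): string {
  return [parts.prompt_text_head, parts.prompt_text_body, parts.prompt_text_tail]
    .filter((part): part is string => typeof part === 'string' && part.trim() !== '')
    .join('\n\n');
}
```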

#### UI Changes

The PromptEditor component now displays:
- A "Name" field for `prompt_name`
- Separate "Prefix", "Body", and "Suffix" fields
- Preview of the full prompt text

No code changes are required if you're using the `PromptEditor` component - it automatically supports the new structure.

## Observability

By default, every LLM call writes one row to `hazo_llm_api_log` (SQLite or PostgreSQL; the table is created on first call, so no manual migration is needed). Each row contains the provider, model, service_type, token counts, `cost_usd`, `latency_ms`, `session_id`, `reference`, `error_code`, and a `context_json` blob.

### Response Usage

```typescript
const response = await hazo_llm_text_text({ prompt: 'Hello' });

if (response.usage) {
  console.log(`Cost: $${response.usage.cost_usd?.toFixed(6) ?? 'n/a'}`);
  console.log(`Tokens: ${response.usage.input_tokens}in / ${response.usage.output_tokens}out`);
  console.log(`Latency: ${response.usage.latency_ms}ms`);
}
```

### Querying the Log

```typescript
import { get_llm_usage_summary, get_llm_call_detail } from 'hazo_llm_api/server';

// Aggregate stats — group by provider over the last 7 days
const summary = await get_llm_usage_summary({
  from: '2026-05-18',
  to: '2026-05-25',
  group_by: 'provider',
  limit: 100,
});

// Single call drill-down
const detail = await get_llm_call_detail('some-uuid-here');
```

### Pricing Overrides and Retroactive Recalculation

```typescript
import { update_pricing, recompute_costs } from 'hazo_llm_api/server';

// Override pricing at runtime
update_pricing('anthropic/claude-sonnet-4-6', {
  kind: 'text',
  input_per_1m: 3.0,
  output_per_1m: 15.0,
});

// Retroactively recalculate costs after a pricing correction
const result = await recompute_costs({
  from: '2026-05-01',
  to: '2026-05-25',
  models: ['claude-sonnet-4-6'],
  dry_run: false,  // Set true to preview without writing
});
console.log(`Updated ${result.updated_count} rows`);
```
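
As a worked example of what per-million-token pricing implies, the sketch below estimates the cost of one call. It assumes `input_per_1m` and `output_per_1m` are USD prices per one million tokens, as the names suggest; the helper `estimate_cost_usd` is hypothetical and the package's actual cost calculation may include factors not shown here.

```typescript
interface TextPricing {
  input_per_1m: number;   // assumed: USD per 1,000,000 input tokens
  output_per_1m: number;  // assumed: USD per 1,000,000 output tokens
}

function estimate_cost_usd(
  input_tokens: number,
  output_tokens: number,
  pricing: TextPricing,
): number {
  const input_cost = (input_tokens / 1_000_000) * pricing.input_per_1m;
  const output_cost = (output_tokens / 1_000_000) * pricing.output_per_1m;
  return input_cost + output_cost;
}
```

With the prices from the override above (3.0 and 15.0), a call with 1,000 input tokens and 500 output tokens comes to about $0.0105.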

### Disabling the Log Writer

```typescript
await initialize_llm_api({
  logger,
  api_log: { enabled: false },  // No rows written to hazo_llm_api_log
});
```

### hazo_logs Integration

If `hazo_logs ^1.1.0` is installed, `session_id` and `reference` are read automatically from the async local storage context — no extra configuration required.

---

## Cascade Fallback

Pass `providers` on any call to define an ordered fallback list. The call tries each provider in sequence, stopping on the first success. Failed attempts are recorded in `response.usage.attempts`.

```typescript
// Try Anthropic first, fall back to OpenAI, then Gemini
const response = await hazo_llm_text_text({
  prompt: 'Hello',
  providers: ['anthropic', 'openai', 'gemini'],
});
```

### Default Cascade Config

Set a default cascade for all calls at init time:

```typescript
import { DEFAULT_CASCADE_ON_CODES } from 'hazo_llm_api/server';

await initialize_llm_api({
  logger,
  cascade: {
    providers: ['anthropic', 'openai'],
    on_codes: DEFAULT_CASCADE_ON_CODES, // RATE_LIMITED, NETWORK_ERROR, TIMEOUT
    timeout_ms_per_attempt: 30000,
  },
});
```

Per-call `providers` always overrides the init-time cascade config.
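
The cascade behavior described in this section (try providers in order, stop on the first success, record failed attempts) can be sketched as a small loop. This is an illustration rather than the package's implementation: `run_cascade`, `AttemptResult`, and the `NO_PROVIDERS` code are hypothetical, and stopping early on error codes outside `on_codes` is an assumption about what that option means.

```typescript
type AttemptResult<T> =
  | { success: true; data: T }
  | { success: false; error_code: string };

interface CascadeOutcome<T> {
  result: AttemptResult<T>;
  provider?: string;
  attempts: Array<{ provider: string; error_code: string }>;
}

async function run_cascade<T>(
  providers: string[],
  call_provider: (provider: string) => Promise<AttemptResult<T>>,
  on_codes: string[] = ['RATE_LIMITED', 'NETWORK_ERROR', 'TIMEOUT'],
): Promise<CascadeOutcome<T>> {
  const attempts: Array<{ provider: string; error_code: string }> = [];
  let last: AttemptResult<T> = { success: false, error_code: 'NO_PROVIDERS' };

  for (const provider of providers) {
    last = await call_provider(provider);
    if (last.success) {
      // First success wins; earlier failures stay in `attempts`.
      return { result: last, provider, attempts };
    }
    attempts.push({ provider, error_code: last.error_code });
    // Assumption: only the listed codes trigger a fallback to the next provider.
    if (!on_codes.includes(last.error_code)) break;
  }

  return { result: last, attempts };
}
```

A per-attempt timeout, as configured by `timeout_ms_per_attempt`, is left out of this sketch to keep it short.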

---

## Embeddings

Generate single or batch text embeddings (requires OpenAI provider):

```typescript
import { hazo_llm_embed } from 'hazo_llm_api/server';

// Single text
const result = await hazo_llm_embed({ text: 'Hello world' });
if (result.success) {
  console.log(result.vectors![0]);  // number[]
  console.log(`Dimensions: ${result.dimensions}`);
}

// Batch
const batch = await hazo_llm_embed({
  text: ['First sentence', 'Second sentence', 'Third sentence'],
  model: 'text-embedding-3-large',  // optional override
});
```

Repeated calls with the same text are served from an in-memory LRU cache. For persistent caching, provide a `Keyv` instance at init time:

```typescript
import Keyv from 'keyv';
import KeyvSqlite from '@keyv/sqlite';

await initialize_llm_api({
  logger,
  embed_cache: {
    max_size: 5000,
    keyv: new Keyv({ store: new KeyvSqlite('sqlite://embed_cache.sqlite') }),
  },
});
```

---

## Prompt Caching (Anthropic)

Pass `prompt_parts` instead of `prompt` to hint which parts should be cached at the provider level. Anthropic supports prompt caching via `cache_control`. Other providers concatenate parts as plain text.

```typescript
const response = await hazo_llm_text_text({
  prompt_parts: [
    { text: long_system_context, cache: true },   // Cached by Anthropic
    { text: user_message },                        // Not cached
  ],
});
```

---

## Cost Cap

Prevent runaway spend by configuring a per-session cost cap. The gate reads `SUM(cost_usd)` from `hazo_llm_api_log` before each call.

```typescript
await initialize_llm_api({
  logger,
  cost_cap: {
    enabled: true,
    window: 'session',          // 'session' | 'day' | 'hour'
    on_exceeded: 'block',       // 'block' | 'warn'
    get_user_cap: async ({ session_id }) => {
      // Return the USD cap for this session
      return await db.getUserSpendLimit(session_id);
    },
    cost_cap_exceeded: async ({ session_id, current_usd, cap_usd }) => {
      // Called only in 'warn' mode
      console.warn(`Session ${session_id} exceeded cap: $${current_usd} of $${cap_usd}`);
    },
  },
});
```

In `block` mode, calls that would exceed the cap return immediately with `error_code: 'COST_LIMIT_EXCEEDED'`. In `warn` mode, the `cost_cap_exceeded` hook fires and the call proceeds.

Requires `hazo_logs ^1.1.0`, which propagates `session_id` through Node's async local storage (ALS).
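
The difference between the two modes can be summarized as a small decision function. This is a sketch under assumptions: the name `decide_cost_gate` is hypothetical, and treating the cap as reached when accumulated spend is greater than or equal to the cap is a guess at the exact comparison.

```typescript
type CapMode = 'block' | 'warn';
type GateDecision = 'allow' | 'block' | 'warn';

// current_usd: SUM(cost_usd) for the window; cap_usd: value from get_user_cap.
function decide_cost_gate(current_usd: number, cap_usd: number, mode: CapMode): GateDecision {
  if (current_usd < cap_usd) return 'allow';
  // Cap reached: 'block' rejects the call, 'warn' fires the hook and proceeds.
  return mode === 'block' ? 'block' : 'warn';
}
```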

---

## Admin UI Components

Import dashboard components for monitoring spend and inspecting individual calls. Requires `hazo_ui ^2.17.0`.

### LLMCostDashboard

```typescript
import { LLMCostDashboard } from 'hazo_llm_api';
import type { FetchSummaryFn } from 'hazo_llm_api';

// Build the fetch function using your framework's API routes
const fetch_summary: FetchSummaryFn = async (opts) => {
  const params = new URLSearchParams({ ...opts });
  const res = await fetch(`/api/llm-summary?${params}`);
  return res.json();
};

function MonitoringPage() {
  return <LLMCostDashboard fetch_summary={fetch_summary} />;
}
```

The component provides date-range filters, group-by selector (`date` / `provider` / `model` / `service_type`), an SVG time-series chart (when `group_by=date`), and a summary table with totals.

### LLMCallInspector

```typescript
import { LLMCallInspector } from 'hazo_llm_api';
import type { FetchDetailFn } from 'hazo_llm_api';

const fetch_detail: FetchDetailFn = async (id) => {
  const res = await fetch(`/api/llm-call-detail?id=${encodeURIComponent(id)}`);
  return res.json();
};

function CallDetailPage({ call_id }: { call_id: string }) {
  return <LLMCallInspector callId={call_id} fetch_detail={fetch_detail} />;
}
```

---

## Maintenance

Register the log purge job with `hazo_jobs` to automatically delete old log rows:

```typescript
import { llm_api_purge_log_job } from 'hazo_llm_api/server';
import { register_job } from 'hazo_jobs/server';

// Register the job — hazo_jobs scheduler calls it on your configured schedule
register_job(llm_api_purge_log_job);

// Payload when triggering the job
// { retention_days: 90 }  — rows older than 90 days are deleted
```

The job requires `hazo_jobs ^0.12.0` as a peer dependency.
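
To see which rows a given `retention_days` value would affect, the cutoff date can be computed as below. The helper `retention_cutoff` is hypothetical and assumes that "older than N days" means a timestamp earlier than now minus N × 24 hours; the job's actual boundary handling may differ.

```typescript
// Returns the timestamp before which log rows would count as expired.
function retention_cutoff(now: Date, retention_days: number): Date {
  const ms_per_day = 24 * 60 * 60 * 1000;
  return new Date(now.getTime() - retention_days * ms_per_day);
}
```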

---

## Environment Variables

| Variable | Provider | Required |
|----------|----------|----------|
| `GEMINI_API_KEY` | Gemini | When `gemini` is in `enabled_llms` |
| `QWEN_API_KEY` | Qwen | When `qwen` is in `enabled_llms` |
| `ANTHROPIC_API_KEY` | Anthropic | When `anthropic` is in `enabled_llms` |
| `OPENAI_API_KEY` | OpenAI | When `openai` is in `enabled_llms` |
| `DEEPSEEK_API_KEY` | DeepSeek | When `deepseek` is in `enabled_llms` |

Store all keys in `.env.local`, and never commit that file to git.

---

## Support

For issues and questions, please visit the [GitHub Issues](https://github.com/pub12/hazo_llm_api/issues) page.