# llm-polyglot

> Universal client for LLM providers with OpenAI-compatible interface

`llm-polyglot` extends the OpenAI SDK to provide a consistent interface across different LLM providers. Use the same familiar OpenAI-style API with Anthropic, Google, and others.

## Provider Support

**Native API Support Status:**

| Provider API | Status | Chat | Basic Stream | Functions/Tool calling | Function streaming | Notes |
|-------------|---------|------|--------------|---------------------|-----------------|--------|
| OpenAI | ✅ | ✅ | ✅ | ✅ | ✅ | Direct SDK proxy |
| Anthropic | ✅ | ✅ | ✅ | ❌ | ❌ | Claude models |
| Google | ✅ | ✅ | ✅ | ✅ | ❌ | Gemini models + context caching |
| Azure | 🚧 | ✅ | ✅ | ❌ | ❌ | OpenAI model hosting |
| Cohere | ❌ | - | - | - | - | Not supported |
| AI21 | ❌ | - | - | - | - | Not supported |

Stream Types:

- **Basic Stream**: Simple text streaming
- **Partial JSON Stream**: Progressive JSON object construction during streaming
- **Function Stream**: Streaming function/tool calls and their results

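To make the difference between a basic stream and a function stream concrete, here is a minimal sketch of how a consumer might fold streamed chunks into a final result. It assumes chunks follow the OpenAI chat completion chunk shape (`choices[0].delta.content` for text, `choices[0].delta.tool_calls[].function.arguments` for tool call fragments). The names `StreamChunk`, `StreamState`, `emptyState`, `applyChunk`, and `collectStream` are hypothetical helpers written for this illustration; they are not part of `llm-polyglot`.

```typescript
// Local types mirroring the assumed OpenAI-style chunk shape (illustration only).
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface StreamChunk {
  choices: Array<{
    delta?: { content?: string | null; tool_calls?: ToolCallDelta[] };
  }>;
}

interface ToolCallState {
  name: string;
  arguments: string; // raw JSON text; only guaranteed parseable once the stream ends
}

interface StreamState {
  text: string;
  toolCalls: ToolCallState[];
}

function emptyState(): StreamState {
  return { text: "", toolCalls: [] };
}

// Fold one chunk into the accumulated state and return a new state object.
function applyChunk(state: StreamState, chunk: StreamChunk): StreamState {
  const delta = chunk.choices[0]?.delta;
  if (!delta) return state;

  // Function stream: name and argument fragments arrive per tool call index.
  const toolCalls = state.toolCalls.map((call) => ({ ...call }));
  for (const part of delta.tool_calls ?? []) {
    const current = toolCalls[part.index] ?? { name: "", arguments: "" };
    toolCalls[part.index] = {
      name: current.name + (part.function?.name ?? ""),
      arguments: current.arguments + (part.function?.arguments ?? ""),
    };
  }

  // Basic stream: text fragments are simply concatenated.
  return { text: state.text + (delta.content ?? ""), toolCalls };
}

// Drain any async iterable of chunks into a final state.
async function collectStream(
  stream: AsyncIterable<StreamChunk>
): Promise<StreamState> {
  let state = emptyState();
  for await (const chunk of stream) {
    state = applyChunk(state, chunk);
  }
  return state;
}
```

With a streaming response in hand, `await collectStream(stream)` would yield the full text plus one entry per tool call, and each `arguments` string can be passed to `JSON.parse` after the stream completes. A partial JSON stream goes one step further by exposing a usable object before the JSON is complete, which this sketch does not attempt.
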

**OpenAI-Compatible Hosting Providers:**

These providers use the OpenAI SDK format, so they work directly with the OpenAI client configuration:

| Provider | How to Use | Available Models |
|----------|------------|------------------|
| Together | Use OpenAI client with Together base URL | Mixtral, Llama, OpenChat, Yi, others |
| Anyscale | Use OpenAI client with Anyscale base URL | Mistral, Llama, others |
| Perplexity | Use OpenAI client with Perplexity base URL | pplx-* models |
| Replicate | Use OpenAI client with Replicate base URL | Various open models |

## Installation

```bash
# Base installation
npm install llm-polyglot openai

# Provider-specific SDKs (as needed)
npm install @anthropic-ai/sdk # For Anthropic
npm install @google/generative-ai # For Google/Gemini
```

## Basic Usage

```typescript
import { createLLMClient } from "llm-polyglot";

// Initialize provider-specific client
const client = createLLMClient({
  provider: "anthropic" // or "google", "openai", etc.
});

// Use consistent OpenAI-style interface
const completion = await client.chat.completions.create({
  model: "claude-3-opus-20240229",
  messages: [{ role: "user", content: "Hello!" }],
  max_tokens: 1000
});
```

## Provider-Specific Features

### Anthropic

The llm-polyglot library provides support for Anthropic's API, including standard chat completions, streaming chat completions, and function calling. Both input parameters and responses exactly match those of the OpenAI SDK. For more detailed documentation, please see the OpenAI docs: [https://platform.openai.com/docs/api-reference](https://platform.openai.com/docs/api-reference)

The Anthropic SDK is required when using the anthropic provider; we only use the types provided by the SDK.

```bash
bun add @anthropic-ai/sdk
```

```typescript
import { createLLMClient } from "llm-polyglot";

const client = createLLMClient({ provider: "anthropic" });

// Standard completion
const response = await client.chat.completions.create({
  model: "claude-3-opus-20240229",
  messages: [{ role: "user", content: "Hello!"
  }]
});

// Streaming
const stream = await client.chat.completions.create({
  model: "claude-3-opus-20240229",
  messages: [{ role: "user", content: "Hello!" }],
  stream: true
});

// Print each text delta as it arrives
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

// Tool/Function calling
const result = await client.chat.completions.create({
  model: "claude-3-opus-20240229",
  messages: [{ role: "user", content: "Analyze this data" }],
  tools: [{
    type: "function",
    function: {
      name: "analyze",
      parameters: {
        type: "object",
        properties: { sentiment: { type: "string" } }
      }
    }
  }]
});
```

### Google (Gemini)

The llm-polyglot library provides support for Google's Gemini API including:

- Standard chat completions with OpenAI-compatible interface
- Streaming chat completions with delta updates
- Function/tool calling with automatic schema conversion
- Context caching for token optimization (requires paid API key)
- Grounding support with Google Search integration
- Safety settings and model generation config
- Session management for stateful conversations
- Automatic response transformation with source attribution

The Google generative-ai SDK is required when using the google provider:

```bash
bun add @google/generative-ai
```

All of the functionality above uses OpenAI's request schema; the library translates the OpenAI params spec into Gemini's model spec.

#### Basic Usage

```typescript
import { createLLMClient } from "llm-polyglot";

const client = createLLMClient({ provider: "google" });

// Standard completion
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Hello!" }],
  max_tokens: 1000
});

// With grounding (Google Search)
const groundedCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "What are the latest AI developments?"
  }],
  groundingThreshold: 0.7,
  max_tokens: 1000
});

// With safety settings
const safeCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Tell me a story" }],
  additionalProperties: {
    safetySettings: [{
      category: "HARM_CATEGORY_HARASSMENT",
      threshold: "BLOCK_MEDIUM_AND_ABOVE"
    }]
  }
});

// With session management
const sessionCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Remember this: I'm Alice" }],
  additionalProperties: {
    sessionId: "user-123"
  }
});
```

#### Context Caching

[Context Caching](https://ai.google.dev/gemini-api/docs/caching) is a feature specific to Gemini that helps cut down on duplicate token usage by allowing you to create a cache with a TTL:

```typescript
// Create a cache
const cache = await client.cacheManager.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ role: "user", content: "Context to cache" }],
  ttlSeconds: 3600 // Cache for 1 hour
});

// Use the cached context
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ role: "user", content: "Follow-up question" }],
  additionalProperties: {
    cacheName: cache.name
  }
});
```

#### Function/Tool Calling

```typescript
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Analyze this data" }],
  tools: [{
    type: "function",
    function: {
      name: "analyze",
      parameters: {
        type: "object",
        properties: { sentiment: { type: "string" } }
      }
    }
  }],
  tool_choice: {
    type: "function",
    function: { name: "analyze" }
  }
});
```

## Error Handling

```