# browser-llm-engine

A **browser-friendly** library for running large language models (LLMs) directly in the browser using [Wllama](https://github.com/nadchif/wllama). This library provides a simple interface to load `.gguf` or `.bin` models (e.g., from Hugging Face) and generate text completions, including **streaming token** support.

---

## Features

- **Plug-and-Play**: Easy to integrate into your web projects.
- **Local or Remote Models**: Load a model from a URL (e.g., Hugging Face) or pass local `File` objects.
- **Token-by-Token Streaming**: Handle partial results in real time via the `onNewToken` callback.
- **Templates**: Leverages [Jinja](https://github.com/huggingface/jinja) to format chat-based prompts.
- **Lightweight**: Bundles a minimal set of dependencies.

---

## Table of Contents

1. [Installation](#installation)
2. [Usage](#usage)
   - [Quick Start](#quick-start)
   - [Streaming](#streaming)
   - [Loading Local Files](#loading-local-files)
3. [Preset Models](#preset-models)
4. [API](#api)
   - [createLlmEngine](#createllmengine)
   - [loadModel](#loadmodel)
   - [formatChat](#formatchat)
   - [createCompletion](#createcompletion)
   - [exit](#exit)
5. [Local Development](#local-development)
6. [License](#license)

---

## Installation

```bash
npm install browser-llm-engine
```

Or with Yarn:

```bash
yarn add browser-llm-engine
```

---

## Usage

### Quick Start

```js
import { createLlmEngine, CHAT_ROLE, PRESET_MODELS } from 'browser-llm-engine';

(async () => {
  // 1) Create an engine instance
  const llm = createLlmEngine({
    // Optional: provide custom WASM paths or config
    wasmPaths: {}
  });

  // 2) Load a preset model from the library
  const modelUrl = PRESET_MODELS["SmolLM2 (360M)"].url;
  await llm.loadModel(modelUrl, {
    progressCallback: (progress) => console.log(`Loading: ${progress}%`),
  });

  // 3) Generate a completion
  const result = await llm.createCompletion("Hello from the browser!");
  console.log("Full model response:", result);

  // 4) Clean up
  await llm.exit();
})();
```

That’s it! You have a working LLM in the browser.

---

### Streaming

To receive tokens as they are generated, supply an `onNewToken` callback:

```js
import { createLlmEngine, PRESET_MODELS } from 'browser-llm-engine';

const llm = createLlmEngine();
await llm.loadModel(PRESET_MODELS["SmolLM2 (360M)"].url);

// Accumulate the streamed tokens into the full response
let outputSoFar = "";
await llm.createCompletion("What's the weather today?", {
  nPredict: 128,
  sampling: { temp: 0.7, penalty_repeat: 1.1 },
  onNewToken: (token) => {
    outputSoFar += token;
    console.log("Streamed token:", token);
  }
});

console.log("Final streamed output:", outputSoFar);
```

---

### Loading Local Files

If you want to load the model from your local machine:

```html
```

---

## Preset Models

The library includes a `models.json` with references to a few hosted models. You can access them via:

```js
import { PRESET_MODELS } from 'browser-llm-engine';

console.log("Available models:", PRESET_MODELS);
```

Feel free to add or remove entries if you fork this library.

---

## API

### `createLlmEngine(config?)`

Creates a new engine instance.

- **Parameters:**
  - `config` (Object) – Optional configuration, e.g. `{ wasmPaths: { ... } }`.

### `loadModel(source, options?)`

Loads the model from either a remote URL or local `File` objects.

- **Parameters:**
  - `source` (String | File[] | FileList) – The source of the model.
  - `options` (Object) – Additional load options:
    - `progressCallback` (function): `(progress) => {}` for tracking loading progress
    - `useCache` (Boolean): Cache the model for faster reloads
    - `allowOffline` (Boolean): If `false`, tries to fetch from the network

### `formatChat(messages, useProvidedTemplate?)`

Takes an array of messages (each with `role` and `content`) and formats them into a single prompt with Jinja.

### `createCompletion(prompt, options?)`

Creates a text completion for a given `prompt`.

- **Parameters:**
  - `prompt` (String) – The text to generate from.
  - `options` (Object) – Generation options:
    - `nPredict` (Number) – Maximum tokens to predict (default 512)
    - `sampling` (Object) – e.g. `{ temp: 0.7, penalty_repeat: 1.1 }`
    - `onNewToken` (function) – A callback for streaming tokens

### `exit()`

Cleans up resources used by Wllama.

- **Example:**

  ```js
  await llm.exit();
  ```

---

## Local Development

If you want to **develop locally**:

1. Clone the repo:

   ```bash
   git clone https://github.com/you/browser-llm-engine.git
   cd browser-llm-engine
   ```

2. Install dependencies:

   ```bash
   npm install
   ```

3. Build the library:

   ```bash
   npm run build
   ```

   This will create `dist/` with both ESM and CJS bundles.

4. _(Optional)_ Start a dev server (if you add a script in `package.json`):

   ```bash
   npm run dev
   ```

5. Open `index.html` (or any dev test page) in your browser to try out the library.

---

## License

This project is released under the [MIT License](./LICENSE). Feel free to fork, adapt, and contribute!

---

**Happy coding and enjoy using your LLM in the browser!**
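
---

As a footnote to the Loading Local Files section: since `loadModel` accepts `File[]` or a `FileList`, one possible way to wire up a file picker is sketched below. This is only an illustration under stated assumptions. `selectModelFiles` and `attachModelPicker` are hypothetical helpers (not part of this library), and the page is assumed to contain an `<input type="file" multiple>` element. Sorting the files by name is also an assumption, meant to keep multi-part models in a predictable order.

```javascript
// Hypothetical helpers for illustration; they are NOT exported by browser-llm-engine.

// Keep only files with a model extension (.gguf or .bin), sorted by name.
// Sorting is an assumption that keeps multi-part models in a stable order.
function selectModelFiles(fileList) {
  return Array.from(fileList)
    .filter((file) => /\.(gguf|bin)$/i.test(file.name))
    .sort((a, b) => a.name.localeCompare(b.name));
}

// Wire a file input to an engine instance created with createLlmEngine().
// `input` is assumed to be an <input type="file" multiple> element.
// `onReady` is an optional callback invoked once loading has finished.
function attachModelPicker(input, llm, onReady) {
  input.addEventListener('change', async () => {
    const files = selectModelFiles(input.files);
    if (files.length === 0) return; // nothing usable was selected

    // loadModel accepts File[] according to the API section above
    await llm.loadModel(files, {
      progressCallback: (progress) => console.log(`Loading: ${progress}%`),
    });

    if (onReady) onReady(files);
  });
}
```

In a page, this might be used as `attachModelPicker(document.getElementById('model-file'), llm, () => console.log('Model ready'))`, where `model-file` is a placeholder id for your own file input and `llm` comes from `createLlmEngine()`.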