# nodejs-whisper

Node.js bindings for OpenAI's Whisper model.

[![MIT License](https://img.shields.io/badge/License-MIT-green.svg)](https://choosealicense.com/licenses/mit/)

## Features

-   Automatically converts input audio to 16 kHz WAV, the format the Whisper model expects
-   Outputs transcripts as `.txt`, `.srt`, `.vtt`, `.json`, `.csv`, `.wts`, or `.lrc`
-   Optimized for CPU (including Apple Silicon ARM)
-   Word-level timestamp precision
-   Split on word rather than on token (optional)
-   Translate from the source language to English (optional)

## Installation

1. Install build tools (the commands below are for Debian/Ubuntu)

```bash
sudo apt update
sudo apt install build-essential
```

2. Install nodejs-whisper with npm

```bash
  npm i nodejs-whisper
```

3. Download whisper model

```bash
  npx nodejs-whisper download
```

-   Note: `make` must be installed and available on your PATH (see step 1).

### Windows Installation

1. Install MinGW-w64 or MSYS2 (which includes make tools)
   - Option 1: Install MSYS2 from https://www.msys2.org/
   - Option 2: Install MinGW-w64 from https://www.mingw-w64.org/

2. Install nodejs-whisper with npm
```bash
npm i nodejs-whisper
```

3. Download whisper model
```bash
npx nodejs-whisper download
```

- Note: Make sure `mingw32-make` or `make` is available in your system PATH.

## Usage/Examples

See `example/index.ts` (can be run with `$ npm run test`)

```javascript
import path from 'path'
import { nodewhisper } from 'nodejs-whisper'

// Provide the exact path to your audio file.
const filePath = path.resolve(__dirname, 'YourAudioFileName')

await nodewhisper(filePath, {
	modelName: 'base.en', // name of a downloaded model
	modelRootPath: '/path/to/whisper/models', // (optional) directory containing the selected ggml model file
	autoDownloadModelName: 'base.en', // (optional) download this model automatically if it is not present
	removeWavFileAfterTranscription: false, // (optional) remove the WAV file once transcribed
	withCuda: false, // (optional) use CUDA for faster processing
	logger: console, // (optional) logging instance, defaults to console
	whisperOptions: {
		outputInCsv: false, // get output result in csv file
		outputInJson: false, // get output result in json file
		outputInJsonFull: false, // get output result in json file including more information
		outputInLrc: false, // get output result in lrc file
		outputInSrt: true, // get output result in srt file
		outputInText: false, // get output result in txt file
		outputInVtt: false, // get output result in vtt file
		outputInWords: false, // get output result in wts file for karaoke
		translateToEnglish: false, // translate from the source language to English
		wordTimestamps: false, // word-level timestamps
		timestamps_length: 20, // amount of dialogue per timestamp pair
		splitOnWord: true, // split on word rather than on token
		noGpu: false, // disable GPU inference
	},
})

// Model list
const MODELS_LIST = [
	'tiny',
	'tiny.en',
	'base',
	'base.en',
	'small',
	'small.en',
	'medium',
	'medium.en',
	'large-v1',
	'large',
	'large-v3-turbo',
]
```
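
Setting each `outputIn*` flag by hand gets repetitive when the output formats are chosen at run time. The sketch below is one possible helper, not part of the package: `buildOutputFlags` and `FORMAT_FLAGS` are hypothetical names, and the extension-to-flag mapping is inferred from the option comments above.

```javascript
// Hypothetical helper (not part of nodejs-whisper): maps output file
// extensions to the `outputIn*` flags shown in the options above.
const FORMAT_FLAGS = new Map([
	['csv', 'outputInCsv'],
	['json', 'outputInJson'],
	['lrc', 'outputInLrc'],
	['srt', 'outputInSrt'],
	['txt', 'outputInText'],
	['vtt', 'outputInVtt'],
	['wts', 'outputInWords'],
])

function buildOutputFlags(formats) {
	const flags = {}
	for (const format of formats) {
		// Accept 'srt', '.srt' and 'SRT' alike.
		const key = String(format).toLowerCase().replace(/^\./, '')
		const flag = FORMAT_FLAGS.get(key)
		if (!flag) {
			throw new Error(`Unsupported output format: ${format}`)
		}
		flags[flag] = true
	}
	return flags
}

// Example: request SRT and VTT output alongside another whisper option.
const whisperOptions = { ...buildOutputFlags(['srt', '.vtt']), splitOnWord: true }
// whisperOptions -> { outputInSrt: true, outputInVtt: true, splitOnWord: true }
```

The resulting object can be passed as `whisperOptions` in the `nodewhisper` call.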

Custom CMake flags can be passed through the `NODEJS_WHISPER_CMAKE_ARGS` environment variable.

```bash
NODEJS_WHISPER_CMAKE_ARGS="-DGGML_NATIVE=OFF" npm test
```

When `modelRootPath` is used with `autoDownloadModelName`, downloaded models are saved in that directory.

Docker model cache example:

```yaml
volumes:
    - ./.docker-data/whisper-models:/data/whisper-models
```

```javascript
await nodewhisper(filePath, {
    modelName: 'tiny.en',
    autoDownloadModelName: 'tiny.en',
    modelRootPath: '/data/whisper-models',
    whisperOptions: {
        outputInSrt: true,
    },
})
```

The downloaded model is stored at `/data/whisper-models/ggml-tiny.en.bin`. The package's internal downloader scripts remain available.
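
To check a cache directory before starting a job, the expected model file can be derived from the model name. This is a sketch under one assumption: that every model follows the `ggml-<modelName>.bin` naming seen above for `tiny.en`. `modelFilePath` is a hypothetical helper, not a package API, and it joins with `/` because the example targets a Linux container path.

```javascript
// Hypothetical helper (not part of nodejs-whisper).
// Assumption: model files are named `ggml-<modelName>.bin`, as in the
// `/data/whisper-models/ggml-tiny.en.bin` example above.
function modelFilePath(modelRootPath, modelName) {
	const root = modelRootPath.replace(/\/+$/, '') // drop trailing slashes
	return `${root}/ggml-${modelName}.bin`
}

const expectedPath = modelFilePath('/data/whisper-models/', 'tiny.en')
// expectedPath -> '/data/whisper-models/ggml-tiny.en.bin'
```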

## Types

```typescript
interface IOptions {
	modelName: string
	modelRootPath?: string
	removeWavFileAfterTranscription?: boolean
	withCuda?: boolean
	autoDownloadModelName?: string
	whisperOptions?: WhisperOptions
	logger?: Console
}

interface WhisperOptions {
	outputInCsv?: boolean
	outputInJson?: boolean
	outputInJsonFull?: boolean
	outputInLrc?: boolean
	outputInSrt?: boolean
	outputInText?: boolean
	outputInVtt?: boolean
	outputInWords?: boolean
	translateToEnglish?: boolean
	timestamps_length?: number
	wordTimestamps?: boolean
	splitOnWord?: boolean
	noGpu?: boolean
}
```
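
Because `modelName` and `autoDownloadModelName` are typed as plain `string`, a typo in a model name only surfaces at run time. A small pre-flight check against the model list from the usage example may catch it earlier. This is a sketch: `findUnknownModels` is a hypothetical helper, and the list is copied from this README, so it may lag behind the package.

```javascript
// Model names copied from the usage example in this README.
const MODELS_LIST = [
	'tiny',
	'tiny.en',
	'base',
	'base.en',
	'small',
	'small.en',
	'medium',
	'medium.en',
	'large-v1',
	'large',
	'large-v3-turbo',
]

// Hypothetical helper (not part of nodejs-whisper): returns the option
// keys whose value is set but is not a known model name.
function findUnknownModels(options) {
	return ['modelName', 'autoDownloadModelName'].filter(
		key => options[key] !== undefined && !MODELS_LIST.includes(options[key])
	)
}

const problems = findUnknownModels({ modelName: 'base.en', autoDownloadModelName: 'bse.en' })
// problems -> ['autoDownloadModelName']
```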

## Run locally

Clone the project

```bash
  git clone https://github.com/ChetanXpro/nodejs-whisper
```

Go to the project directory

```bash
  cd nodejs-whisper
```

Install dependencies

```bash
  npm install
```

Start the server

```bash
  npm run dev
```

Build project

```bash
  npm run build
```

## Made with

-   [whisper.cpp](https://github.com/ggerganov/whisper.cpp), the C++ port of OpenAI's Whisper by ggerganov

## Feedback

If you have any feedback, please reach out to us at chetanbaliyan10@gmail.com

## Authors

-   [@chetanXpro](https://www.github.com/chetanXpro)
