# Reference

## Text

client.text.translate({ ...params }) -> SarvamAI.TranslationResponse

#### 📝 Description

**Translation** converts text from one language to another while preserving its meaning. For example, **'मैं ऑफिस जा रहा हूँ'** translates to **'I am going to the office'** in English: the script and language change, but the original meaning remains the same.

Available languages:

- **`bn-IN`**: Bengali
- **`en-IN`**: English
- **`gu-IN`**: Gujarati
- **`hi-IN`**: Hindi
- **`kn-IN`**: Kannada
- **`ml-IN`**: Malayalam
- **`mr-IN`**: Marathi
- **`od-IN`**: Odia
- **`pa-IN`**: Punjabi
- **`ta-IN`**: Tamil
- **`te-IN`**: Telugu

Newly added languages:

- **`as-IN`**: Assamese
- **`brx-IN`**: Bodo
- **`doi-IN`**: Dogri
- **`kok-IN`**: Konkani
- **`ks-IN`**: Kashmiri
- **`mai-IN`**: Maithili
- **`mni-IN`**: Manipuri (Meiteilon)
- **`ne-IN`**: Nepali
- **`sa-IN`**: Sanskrit
- **`sat-IN`**: Santali
- **`sd-IN`**: Sindhi
- **`ur-IN`**: Urdu

For hands-on practice, see the [Translate API Tutorial](https://github.com/sarvamai/sarvam-ai-cookbook/blob/main/notebooks/translate/Translate_API_Tutorial.ipynb) notebook.

#### 🔌 Usage

```typescript
await client.text.translate({
    input: "input",
    source_language_code: "auto",
    target_language_code: "bn-IN",
});
```
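
Since `target_language_code` is expected to be one of the codes listed in the description, a small guard can reject unsupported codes before a request is sent. The following is a minimal sketch: `TRANSLATE_LANGUAGE_CODES` and `isTranslateLanguageCode` are hypothetical helper names, not part of the SDK, and the list is copied from the description above rather than fetched from the API.

```typescript
// Language codes copied from the translate description above.
// Hypothetical helper data; not exported by the SDK.
const TRANSLATE_LANGUAGE_CODES = [
    "bn-IN", "en-IN", "gu-IN", "hi-IN", "kn-IN", "ml-IN",
    "mr-IN", "od-IN", "pa-IN", "ta-IN", "te-IN",
    // Newly added languages
    "as-IN", "brx-IN", "doi-IN", "kok-IN", "ks-IN", "mai-IN",
    "mni-IN", "ne-IN", "sa-IN", "sat-IN", "sd-IN", "ur-IN",
] as const;

type TranslateLanguageCode = (typeof TRANSLATE_LANGUAGE_CODES)[number];

// Type guard: narrows an arbitrary string to one of the documented codes.
function isTranslateLanguageCode(code: string): code is TranslateLanguageCode {
    return (TRANSLATE_LANGUAGE_CODES as readonly string[]).indexOf(code) !== -1;
}
```

A caller could check `isTranslateLanguageCode(userChoice)` before passing the value as `target_language_code`; the `"auto"` value used for `source_language_code` in the usage snippet is deliberately not part of this list.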

#### ⚙️ Parameters

**request:** `SarvamAI.TranslationRequest`

**requestOptions:** `TextClient.RequestOptions`

client.text.identifyLanguage({ ...params }) -> SarvamAI.LanguageIdentificationResponse

#### 📝 Description

Identifies the language (e.g., `en-IN`, `hi-IN`) and script (e.g., Latin, Devanagari) of the input text. Multiple languages are supported.

#### 🔌 Usage

```typescript
await client.text.identifyLanguage({
    input: "input",
});
```

#### ⚙️ Parameters

**request:** `SarvamAI.LanguageIdentificationRequest`

**requestOptions:** `TextClient.RequestOptions`

client.text.transliterate({ ...params }) -> SarvamAI.TransliterationResponse

#### 📝 Description

**Transliteration** converts text from one script to another while preserving the original pronunciation. For example, **'नमस्ते'** becomes **'namaste'** in English, and **'how are you'** can be written as **'हाउ आर यू'** in Devanagari. This process ensures that the sound of the original text remains intact, even when written in a different script.

Transliteration is useful when you want to represent words phonetically across different writing systems, such as converting **'मैं ऑफिस जा रहा हूँ'** to **'main office ja raha hun'** in English letters.

**Translation**, on the other hand, converts text from one language to another while preserving the meaning rather than pronunciation. For example, **'मैं ऑफिस जा रहा हूँ'** translates to **'I am going to the office'** in English, changing both the script and the language while conveying the intended message.

Examples of **Transliteration**:

- **'Good morning'** becomes **'गुड मॉर्निंग'** in Hindi, where the pronunciation is preserved but the meaning is not translated.
- **'सुप्रभात'** becomes **'suprabhat'** in English.

Available languages:

- **`en-IN`**: English
- **`hi-IN`**: Hindi
- **`bn-IN`**: Bengali
- **`gu-IN`**: Gujarati
- **`kn-IN`**: Kannada
- **`ml-IN`**: Malayalam
- **`mr-IN`**: Marathi
- **`od-IN`**: Odia
- **`pa-IN`**: Punjabi
- **`ta-IN`**: Tamil
- **`te-IN`**: Telugu

For hands-on practice, you can explore the notebook tutorial on [Transliterate API Tutorial](https://github.com/sarvamai/sarvam-ai-cookbook/blob/main/notebooks/transliterate/Transliterate_API_Tutorial.ipynb).

#### 🔌 Usage

```typescript
await client.text.transliterate({
    input: "input",
    source_language_code: "auto",
    target_language_code: "bn-IN",
});
```

#### ⚙️ Parameters

**request:** `SarvamAI.TransliterationRequest`

**requestOptions:** `TextClient.RequestOptions`

## SpeechToText

client.speechToText.transcribe({ ...params }) -> SarvamAI.SpeechToTextResponse

#### 📝 Description

**Speech to Text API**

This API transcribes speech to text in multiple Indian languages and English. It supports transcription for interactive applications.

**Available Options:**

- **REST API** (Current Endpoint): For quick responses under 30 seconds with immediate results
- **Batch API**: For longer audio files; see the [Batch API documentation](https://docs.sarvam.ai/api-reference-docs/api-guides-tutorials/speech-to-text/batch-api)
  - Supports diarization (speaker identification)

**Note:**

- Pricing differs for REST and Batch APIs
- Diarization is only available in the Batch API, with separate pricing
- See the [pricing page](https://docs.sarvam.ai/api-reference-docs/getting-started/pricing) for detailed pricing information

#### 🔌 Usage

```typescript
await client.speechToText.transcribe({
    file: fs.createReadStream("/path/to/your/file"),
});
```
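
The choice between the REST endpoint and the Batch API described above can be expressed as a small decision function. This is only a sketch: it reads the 30-second figure as a limit on audio length, which is an assumption, and `chooseSpeechToTextApi` is a hypothetical helper, not an SDK function.

```typescript
type SpeechToTextApi = "rest" | "batch";

// Assumption: the "under 30 seconds" figure is treated as an audio-length limit.
const REST_MAX_SECONDS = 30;

// Hypothetical helper: picks an API from the audio length and the need for diarization.
function chooseSpeechToTextApi(durationSeconds: number, needsDiarization: boolean): SpeechToTextApi {
    // Diarization is only available in the Batch API.
    if (needsDiarization) {
        return "batch";
    }
    // Short clips go to the REST endpoint; anything longer goes to Batch.
    return durationSeconds < REST_MAX_SECONDS ? "rest" : "batch";
}
```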

#### ⚙️ Parameters

**request:** `SarvamAI.SpeechToTextTranscriptionRequest`

**requestOptions:** `SpeechToTextClient.RequestOptions`

client.speechToText.translate({ ...params }) -> SarvamAI.SpeechToTextTranslateResponse

#### 📝 Description

**Speech to Text Translation API**

This API automatically detects the input language, transcribes the speech, and translates the text to English.

**Available Options:**

- **REST API** (Current Endpoint): For quick responses under 30 seconds with immediate results
- **Batch API**: For longer audio files; see the [Batch API documentation](https://docs.sarvam.ai/api-reference-docs/api-guides-tutorials/speech-to-text/batch-api)
  - Supports diarization (speaker identification)

**Note:**

- Pricing differs for REST and Batch APIs
- Diarization is only available in the Batch API, with separate pricing
- See the [pricing page](https://docs.sarvam.ai/api-reference-docs/getting-started/pricing) for detailed pricing information

#### 🔌 Usage

```typescript
await client.speechToText.translate({
    file: fs.createReadStream("/path/to/your/file"),
});
```

#### ⚙️ Parameters

**request:** `SarvamAI.SpeechToTextTranslationRequest`

**requestOptions:** `SpeechToTextClient.RequestOptions`

## TextToSpeech

client.textToSpeech.convert({ ...params }) -> SarvamAI.TextToSpeechResponse

#### 📝 Description

Convert text into spoken audio. The output is a base64-encoded audio string that must be decoded before use.

**Available Models:**

- **bulbul:v3**: Latest model with improved quality, 30+ voices, and temperature control
- **bulbul:v2**: Legacy model with pitch and loudness controls

**Important Notes for bulbul:v3:**

- Pitch and loudness parameters are NOT supported
- Pace range: 0.5 to 2.0
- Preprocessing is automatically enabled
- Default sample rate is 24000 Hz
- Supports sample rates: 8000, 16000, 22050, 24000 Hz (REST API also supports 32000, 44100, 48000 Hz)

#### 🔌 Usage

```typescript
await client.textToSpeech.convert({
    text: "text",
    target_language_code: "bn-IN",
});
```
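
The bulbul:v3 notes above amount to a handful of constraints that can be checked before a request is made. The sketch below is illustrative: `BulbulV3Options` and its camelCase field names are hypothetical and do not claim to match the fields of `SarvamAI.TextToSpeechRequest`; only the numeric limits come from the description.

```typescript
// Hypothetical option shape for illustration only; the real request type may
// use different field names.
interface BulbulV3Options {
    pace?: number;
    pitch?: number;
    loudness?: number;
    sampleRate?: number;
}

// Sample rates listed for bulbul:v3; the second list is REST-only per the notes.
const V3_SAMPLE_RATES = [8000, 16000, 22050, 24000];
const V3_REST_ONLY_SAMPLE_RATES = [32000, 44100, 48000];

// Returns a list of problems; an empty list means the options satisfy the notes.
function checkBulbulV3Options(options: BulbulV3Options, viaRest: boolean): string[] {
    const problems: string[] = [];
    if (options.pitch !== undefined) {
        problems.push("pitch is not supported by bulbul:v3");
    }
    if (options.loudness !== undefined) {
        problems.push("loudness is not supported by bulbul:v3");
    }
    if (options.pace !== undefined && !(options.pace >= 0.5 && options.pace <= 2.0)) {
        problems.push("pace must be between 0.5 and 2.0");
    }
    if (options.sampleRate !== undefined) {
        const allowed = viaRest
            ? V3_SAMPLE_RATES.concat(V3_REST_ONLY_SAMPLE_RATES)
            : V3_SAMPLE_RATES;
        if (allowed.indexOf(options.sampleRate) === -1) {
            problems.push("unsupported sample rate: " + options.sampleRate);
        }
    }
    return problems;
}
```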

#### ⚙️ Parameters

**request:** `SarvamAI.TextToSpeechRequest`

**requestOptions:** `TextToSpeechClient.RequestOptions`

client.textToSpeech.convertStream({ ...params }) -> core.BinaryResponse

#### 📝 Description

Converts the input text into a streamed spoken audio response. This endpoint supports streaming audio using the specified output codec (e.g., `audio/mpeg` for MP3). The response is returned as a binary audio stream, which can be played or saved directly by the client. Supports the `dict_id` parameter to apply a [pronunciation dictionary](https://docs.sarvam.ai/api-reference-docs/pronunciation-dictionary/create) during synthesis.

#### 🔌 Usage

```typescript
await client.textToSpeech.convertStream({
    text: "x",
});
```

#### ⚙️ Parameters

**request:** `SarvamAI.TextToSpeechStreamRequest`

**requestOptions:** `TextToSpeechClient.RequestOptions`

## PronunciationDictionary

client.pronunciationDictionary.list() -> SarvamAI.PronunciationDictionaryGetResponse

#### 📝 Description

Retrieve a list of all pronunciation dictionary IDs associated with the authenticated user.

#### 🔌 Usage

```typescript
await client.pronunciationDictionary.list();
```

#### ⚙️ Parameters

**requestOptions:** `PronunciationDictionaryClient.RequestOptions`

client.pronunciationDictionary.create({ ...params }) -> SarvamAI.PronunciationDictionaryResponse

#### 📝 Description

Upload a `.json` file to create a new pronunciation dictionary. Only supported by **bulbul:v3**.

The file should contain a JSON object with a `pronunciations` key mapping language codes to word-pronunciation pairs. See the [Pronunciation Dictionary guide](/api-reference-docs/api-guides-tutorials/text-to-speech/pronunciation-dictionary) for format details and examples.

The returned `dictionary_id` can be passed as `dict_id` in text-to-speech requests (REST, HTTP Stream, and WebSocket).

**Limits:** Max 10 dictionaries per user, 100 words per dictionary, 1 MB file size.

#### 🔌 Usage

```typescript
await client.pronunciationDictionary.create({
    file: fs.createReadStream("/path/to/your/file"),
});
```
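
The file format and word limit described above can be checked locally before uploading. This is a minimal sketch under stated assumptions: `PronunciationFile`, `countDictionaryWords`, and `serializeDictionary` are hypothetical names, and counting words across all language codes toward the 100-word limit is an assumption about how the limit is applied. The 1 MB file-size limit is not checked here.

```typescript
// Shape described above: a `pronunciations` key mapping language codes
// to word-pronunciation pairs.
interface PronunciationFile {
    pronunciations: { [languageCode: string]: { [word: string]: string } };
}

// Limit from the description; counting across all languages is an assumption.
const MAX_WORDS_PER_DICTIONARY = 100;

// Hypothetical helper: counts every word entry across all language codes.
function countDictionaryWords(file: PronunciationFile): number {
    let total = 0;
    for (const languageCode of Object.keys(file.pronunciations)) {
        total += Object.keys(file.pronunciations[languageCode]).length;
    }
    return total;
}

// Serializes the dictionary to JSON text, throwing if it exceeds the word limit.
function serializeDictionary(file: PronunciationFile): string {
    const words = countDictionaryWords(file);
    if (words > MAX_WORDS_PER_DICTIONARY) {
        throw new Error("dictionary has " + words + " words; the limit is " + MAX_WORDS_PER_DICTIONARY);
    }
    return JSON.stringify(file, null, 2);
}
```

The resulting string could be written to a `.json` file and passed to `create` as a read stream, as in the usage snippet above.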

#### ⚙️ Parameters

**request:** `SarvamAI.CreatePronunciationDictionaryRequest`

**requestOptions:** `PronunciationDictionaryClient.RequestOptions`

client.pronunciationDictionary.update({ ...params }) -> SarvamAI.PronunciationDictionaryUpdateResponse

#### 📝 Description

Update an existing pronunciation dictionary by uploading a JSON file. You can add new words, change existing pronunciations, or both. Entries not included in the uploaded file remain unchanged.

**Limits:** Max 100 words per dictionary, 1 MB file size.

The response includes the `dictionary_id` and the updated pronunciation mappings for verification.

#### 🔌 Usage

```typescript
await client.pronunciationDictionary.update({
    file: fs.createReadStream("/path/to/your/file"),
    dict_id: "dict_id",
});
```
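
The update behaviour described above (new and changed entries are applied, everything else is kept) can be modelled locally to preview what a dictionary should look like after an update. This sketch only mirrors the description; `previewDictionaryUpdate` is a hypothetical helper, and the service performs the real merge.

```typescript
type PronunciationMap = { [languageCode: string]: { [word: string]: string } };

// Local model of the documented update semantics: uploaded entries add to or
// overwrite existing ones, and entries missing from the upload are kept.
function previewDictionaryUpdate(existing: PronunciationMap, uploaded: PronunciationMap): PronunciationMap {
    const merged: PronunciationMap = {};
    // Start from a copy of the existing entries so the input is not mutated.
    for (const languageCode of Object.keys(existing)) {
        merged[languageCode] = { ...existing[languageCode] };
    }
    // Apply the uploaded entries on top, language by language.
    for (const languageCode of Object.keys(uploaded)) {
        merged[languageCode] = { ...(merged[languageCode] || {}), ...uploaded[languageCode] };
    }
    return merged;
}
```

Comparing this preview with the mappings returned in the update response is one way to verify that the update was applied as intended.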

#### ⚙️ Parameters

**request:** `SarvamAI.UpdatePronunciationDictionaryRequest`

**requestOptions:** `PronunciationDictionaryClient.RequestOptions`

client.pronunciationDictionary.delete({ ...params }) -> SarvamAI.PronunciationDictionaryDeleteResponse

#### 📝 Description

Delete a pronunciation dictionary by its ID. Once deleted, the dictionary can no longer be referenced in text-to-speech requests.

#### 🔌 Usage

```typescript
await client.pronunciationDictionary.delete({
    dict_id: "dict_id",
});
```

#### ⚙️ Parameters

**request:** `SarvamAI.PronunciationDictionaryDeleteRequest`

**requestOptions:** `PronunciationDictionaryClient.RequestOptions`

client.pronunciationDictionary.get(dict_id) -> SarvamAI.PronunciationDictionaryData

#### 📝 Description

Retrieve the full pronunciation mappings for a specific dictionary by its ID. Returns the pronunciation data organized by language code, where each language contains word-to-pronunciation pairs.

#### 🔌 Usage

```typescript
await client.pronunciationDictionary.get("dict_id");
```

#### ⚙️ Parameters

**dict_id:** `string`

**requestOptions:** `PronunciationDictionaryClient.RequestOptions`

## Chat

client.chat.completions({ ...params }) -> SarvamAI.CreateChatCompletionResponse

#### 🔌 Usage

```typescript
await client.chat.completions({
    messages: [
        {
            role: "assistant",
        },
    ],
    model: "sarvam-105b",
});
```

#### ⚙️ Parameters

**request:** `SarvamAI.ChatCompletionsRequest`

**requestOptions:** `ChatClient.RequestOptions`

## SpeechToTextJob

client.speechToTextJob.initialise({ ...params }) -> SarvamAI.BulkJobInitResponse

#### 📝 Description

Create a new speech to text bulk job and receive a job UUID and storage folder details for processing multiple audio files. Set `job_parameters.input_audio_codec` when uploads are raw PCM (`pcm_s16le`, `pcm_l16`, or `pcm_raw`); the API auto-detects other formats. PCM must be 16 kHz.

#### 🔌 Usage

```typescript
await client.speechToTextJob.initialise({
    job_parameters: {},
});
```

#### ⚙️ Parameters

**request:** `SarvamAI.SpeechToTextJobRequest`

**requestOptions:** `SpeechToTextJobClient.RequestOptions`

client.speechToTextJob.getStatus(job_id) -> SarvamAI.JobStatusResponse

#### 📝 Description

Retrieve the current status and details of a speech to text bulk job, including progress and file-level information.

**Rate Limiting Best Practice:** To prevent rate limit errors and ensure optimal server performance, we recommend implementing a minimum 5-millisecond delay between consecutive status polling requests. This helps maintain system stability while still providing timely status updates.

#### 🔌 Usage

```typescript
await client.speechToTextJob.getStatus("job_id");
```
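
The recommended delay between status requests can be enforced with a small polling loop. The sketch below is generic and makes no assumption about the fields of `SarvamAI.JobStatusResponse`: the caller supplies both the status fetcher and the completion check. `sleep` and `pollJobStatus` are hypothetical helper names, and the `maxAttempts` cap is an added safeguard, not something the API requires.

```typescript
// Resolves after the given number of milliseconds.
function sleep(ms: number): Promise<void> {
    return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: polls `fetchStatus` until `isFinished` returns true,
// waiting at least `delayMs` between consecutive requests.
async function pollJobStatus<T>(
    fetchStatus: () => Promise<T>,
    isFinished: (current: T) => boolean,
    delayMs: number = 5,
    maxAttempts: number = 1000,
): Promise<T> {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        const current = await fetchStatus();
        if (isFinished(current)) {
            return current;
        }
        // Pause before the next request, but not after the final attempt.
        if (attempt < maxAttempts) {
            await sleep(delayMs);
        }
    }
    throw new Error("job did not finish after " + maxAttempts + " polls");
}
```

For example, `fetchStatus` could be `() => client.speechToTextJob.getStatus(jobId)`, with `isFinished` inspecting whichever status field the response exposes. In practice a longer delay than the minimum is usually a reasonable choice for long-running jobs.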

#### ⚙️ Parameters

**job_id:** `string` — The unique identifier of the job

**requestOptions:** `SpeechToTextJobClient.RequestOptions`

client.speechToTextJob.start(job_id, { ...params }) -> SarvamAI.JobStatusResponse

#### 📝 Description

Start processing a speech to text bulk job after all audio files have been uploaded.

#### 🔌 Usage

```typescript
await client.speechToTextJob.start("job_id", {
    ptu_id: 1,
});
```

#### ⚙️ Parameters

**job_id:** `string` — The unique identifier of the job

**request:** `SarvamAI.SpeechToTextJobStartRequest`

**requestOptions:** `SpeechToTextJobClient.RequestOptions`

client.speechToTextJob.getUploadLinks({ ...params }) -> SarvamAI.FilesUploadResponse

#### 📝 Description

Generate presigned upload URLs for audio files that will be processed in a speech to text bulk job.

#### 🔌 Usage

```typescript
await client.speechToTextJob.getUploadLinks({
    job_id: "job_id",
    files: ["files"],
});
```

#### ⚙️ Parameters

**request:** `SarvamAI.FilesRequest`

**requestOptions:** `SpeechToTextJobClient.RequestOptions`

client.speechToTextJob.getDownloadLinks({ ...params }) -> SarvamAI.FilesDownloadResponse

#### 📝 Description

Generate presigned download URLs for the transcription output files of a completed speech to text bulk job.

#### 🔌 Usage

```typescript
await client.speechToTextJob.getDownloadLinks({
    job_id: "job_id",
    files: ["files"],
});
```

#### ⚙️ Parameters

**request:** `SarvamAI.FilesRequest`

**requestOptions:** `SpeechToTextJobClient.RequestOptions`
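
Taken together, the methods in this section suggest a sequence: initialise, get upload links, upload, start, wait, get download links. The sketch below strings them together against a minimal structural interface rather than the real client type. Everything about response shapes is an assumption: only a `job_id` string on the initialise response is relied on, and the upload and wait steps are delegated to caller-supplied callbacks. `SpeechToTextJobLike`, `BulkRunOptions`, and `runBulkTranscription` are hypothetical names.

```typescript
// Minimal structural view of the methods listed in this section.
// Assumption: the initialise response carries the job UUID as `job_id`.
interface SpeechToTextJobLike {
    initialise(request: { job_parameters: object }): Promise<{ job_id: string }>;
    getUploadLinks(request: { job_id: string; files: string[] }): Promise<unknown>;
    start(job_id: string, request?: object): Promise<unknown>;
    getDownloadLinks(request: { job_id: string; files: string[] }): Promise<unknown>;
}

interface BulkRunOptions {
    inputFiles: string[];
    outputFiles: string[];
    // Uploads the audio using the presigned links (caller-supplied).
    uploadFiles: (uploadLinks: unknown) => Promise<void>;
    // Resolves once the job has finished, e.g. by polling getStatus (caller-supplied).
    waitUntilDone: (jobId: string) => Promise<void>;
}

async function runBulkTranscription(jobs: SpeechToTextJobLike, options: BulkRunOptions): Promise<unknown> {
    // 1. Create the job and read its identifier.
    const { job_id } = await jobs.initialise({ job_parameters: {} });
    // 2. Request presigned upload URLs and upload the audio files.
    const uploadLinks = await jobs.getUploadLinks({ job_id, files: options.inputFiles });
    await options.uploadFiles(uploadLinks);
    // 3. Start processing once all files are uploaded.
    await jobs.start(job_id);
    // 4. Wait for the job to finish.
    await options.waitUntilDone(job_id);
    // 5. Request presigned download URLs for the output files.
    return jobs.getDownloadLinks({ job_id, files: options.outputFiles });
}
```

Keeping the upload and wait steps as callbacks avoids guessing at the structure of the presigned-link and status responses, which this section does not spell out.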

## SpeechToTextTranslateJob

client.speechToTextTranslateJob.initialise({ ...params }) -> SarvamAI.BulkJobInitResponse

#### 📝 Description

Create a new speech to text translate bulk job and receive a job UUID and storage folder details for processing multiple audio files with translation. Set `job_parameters.input_audio_codec` when uploads are raw PCM (`pcm_s16le`, `pcm_l16`, or `pcm_raw`); the API auto-detects other formats. PCM must be 16 kHz.

#### 🔌 Usage

```typescript
await client.speechToTextTranslateJob.initialise({
    ptu_id: 1,
    job_parameters: {},
});
```

#### ⚙️ Parameters

**request:** `SarvamAI.SpeechToTextTranslateJobRequest`

**requestOptions:** `SpeechToTextTranslateJobClient.RequestOptions`

client.speechToTextTranslateJob.getStatus(job_id) -> SarvamAI.JobStatusResponse

#### 📝 Description

Retrieve the current status and details of a speech to text translate bulk job, including progress and file-level information.

**Rate Limiting Best Practice:** To prevent rate limit errors and ensure optimal server performance, we recommend implementing a minimum 5-millisecond delay between consecutive status polling requests. This helps maintain system stability while still providing timely status updates.

#### 🔌 Usage

```typescript
await client.speechToTextTranslateJob.getStatus("job_id");
```

#### ⚙️ Parameters

**job_id:** `string` — The unique identifier of the job

**requestOptions:** `SpeechToTextTranslateJobClient.RequestOptions`

client.speechToTextTranslateJob.start(job_id, { ...params }) -> SarvamAI.JobStatusResponse

#### 📝 Description

Start processing a speech to text translate bulk job after all audio files have been uploaded.

#### 🔌 Usage

```typescript
await client.speechToTextTranslateJob.start("job_id", {
    ptu_id: 1,
});
```

#### ⚙️ Parameters

**job_id:** `string` — The unique identifier of the job

**request:** `SarvamAI.SpeechToTextTranslateJobStartRequest`

**requestOptions:** `SpeechToTextTranslateJobClient.RequestOptions`

client.speechToTextTranslateJob.getUploadLinks({ ...params }) -> SarvamAI.FilesUploadResponse

#### 📝 Description

Generate presigned upload URLs for audio files that will be processed in a speech to text translate bulk job.

#### 🔌 Usage

```typescript
await client.speechToTextTranslateJob.getUploadLinks({
    ptu_id: 1,
    body: {
        job_id: "job_id",
        files: ["files"],
    },
});
```

#### ⚙️ Parameters

**request:** `SarvamAI.SpeechToTextTranslateJobGetUploadLinksRequest`

**requestOptions:** `SpeechToTextTranslateJobClient.RequestOptions`

client.speechToTextTranslateJob.getDownloadLinks({ ...params }) -> SarvamAI.FilesDownloadResponse

#### 📝 Description

Generate presigned download URLs for the translated transcription output files of a completed speech to text translate bulk job.

#### 🔌 Usage

```typescript
await client.speechToTextTranslateJob.getDownloadLinks({
    ptu_id: 1,
    body: {
        job_id: "job_id",
        files: ["files"],
    },
});
```

#### ⚙️ Parameters

**request:** `SarvamAI.SpeechToTextTranslateJobGetDownloadLinksRequest`

**requestOptions:** `SpeechToTextTranslateJobClient.RequestOptions`

## DocumentIntelligence

client.documentIntelligence.initialise({ ...params }) -> SarvamAI.DocDigitizationCreateJobResponse

#### 📝 Description

Creates a new Document Intelligence job.

**Supported Languages (BCP-47 format):**

- `hi-IN`: Hindi (default)
- `en-IN`: English
- `bn-IN`: Bengali
- `gu-IN`: Gujarati
- `kn-IN`: Kannada
- `ml-IN`: Malayalam
- `mr-IN`: Marathi
- `or-IN`: Odia
- `pa-IN`: Punjabi
- `ta-IN`: Tamil
- `te-IN`: Telugu
- `ur-IN`: Urdu
- `as-IN`: Assamese
- `bodo-IN`: Bodo
- `doi-IN`: Dogri
- `ks-IN`: Kashmiri
- `kok-IN`: Konkani
- `mai-IN`: Maithili
- `mni-IN`: Manipuri
- `ne-IN`: Nepali
- `sa-IN`: Sanskrit
- `sat-IN`: Santali
- `sd-IN`: Sindhi

**Output Formats (delivered as ZIP file):**

- `html`: Structured HTML files with layout preservation
- `md`: Markdown files (default)
- `json`: Structured JSON files for programmatic processing

#### 🔌 Usage

```typescript
await client.documentIntelligence.initialise();
```

#### ⚙️ Parameters

**request:** `SarvamAI.DocumentIntelligenceJobRequest`

**requestOptions:** `DocumentIntelligenceClient.RequestOptions`

client.documentIntelligence.getUploadLinks({ ...params }) -> SarvamAI.DocDigitizationUploadFilesResponse

#### 📝 Description

Returns presigned URLs for uploading input files.

**File Constraints:**

- Exactly one file required (PDF or ZIP)
- PDF files: `.pdf` extension
- ZIP files: `.zip` extension

#### 🔌 Usage

```typescript
await client.documentIntelligence.getUploadLinks({
    job_id: "job_id",
    files: ["files"],
});
```
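
The file constraints above are easy to check before requesting upload links. The sketch below is a hypothetical client-side pre-check, not an SDK function; matching extensions case-insensitively is an assumption, since the description does not say how the service treats upper-case extensions.

```typescript
// Hypothetical pre-check mirroring the file constraints above:
// exactly one file, with a .pdf or .zip extension.
// Returns an error message, or null when the list looks acceptable.
function checkDocumentUploadList(files: string[]): string | null {
    if (files.length !== 1) {
        return "exactly one file is required";
    }
    // Assumption: extensions are compared case-insensitively.
    if (!/\.(pdf|zip)$/.test(files[0].toLowerCase())) {
        return "file must have a .pdf or .zip extension";
    }
    return null;
}
```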

#### ⚙️ Parameters

**request:** `SarvamAI.DocDigitizationUploadFilesRequest`

**requestOptions:** `DocumentIntelligenceClient.RequestOptions`

client.documentIntelligence.start(job_id) -> SarvamAI.DocDigitizationJobStatusResponse

#### 📝 Description

Validates the uploaded file and starts processing.

**Validation Checks:**

- File must be uploaded before starting
- File size must not exceed 200 MB
- PDF must be parseable by the PDF parser
- ZIP must contain only JPEG/PNG images
- ZIP must be flat (no nested folders beyond one level)
- ZIP must contain at least one valid image
- Page/image count must not exceed 10 (returns `422` with `max_page_limit_exceeded` if exceeded)
- User must have sufficient credits

**Processing:** The job runs asynchronously. Poll the status endpoint or use a webhook callback for completion notification.

#### 🔌 Usage

```typescript
await client.documentIntelligence.start("job_id");
```
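
Two of the validation checks above, file size and page count, can be tested locally so that an oversized document is caught before `start` returns a `422`. This is a sketch only: `precheckDocument` and `DocumentCandidate` are hypothetical names, and treating 200 MB as 200 * 1024 * 1024 bytes is an assumption, since the description does not say which definition the service uses.

```typescript
// Limits from the validation checks above.
// Assumption: 1 MB is taken as 1024 * 1024 bytes.
const MAX_DOCUMENT_BYTES = 200 * 1024 * 1024;
const MAX_DOCUMENT_PAGES = 10;

// Hypothetical description of a file the caller is about to submit.
interface DocumentCandidate {
    sizeBytes: number;
    pageCount: number; // pages in a PDF, or images in a ZIP
}

// Returns a list of problems; an empty list means both limits are respected.
function precheckDocument(candidate: DocumentCandidate): string[] {
    const problems: string[] = [];
    if (candidate.sizeBytes > MAX_DOCUMENT_BYTES) {
        problems.push("file size exceeds 200 MB");
    }
    if (candidate.pageCount < 1) {
        problems.push("at least one page or image is required");
    }
    if (candidate.pageCount > MAX_DOCUMENT_PAGES) {
        problems.push("page or image count exceeds 10");
    }
    return problems;
}
```

The remaining checks (PDF parseability, ZIP layout, credits) depend on the service and are not modelled here.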

#### ⚙️ Parameters

**job_id:** `string` — The unique identifier of the job

**requestOptions:** `DocumentIntelligenceClient.RequestOptions`

client.documentIntelligence.getStatus(job_id) -> SarvamAI.DocDigitizationJobStatusResponse

#### 📝 Description

Returns the current status of a job with page-level metrics.

**Job States:**

- `Accepted`: Job created, awaiting file upload
- `Pending`: File uploaded, waiting to start
- `Running`: Processing in progress
- `Completed`: All pages processed successfully
- `PartiallyCompleted`: Some pages succeeded, some failed
- `Failed`: All pages failed or job-level error

**Page Metrics:** Response includes detailed progress: total pages, pages processed, succeeded, failed, and per-page errors.

#### 🔌 Usage

```typescript
await client.documentIntelligence.getStatus("job_id");
```

#### ⚙️ Parameters

**job_id:** `string` — The unique identifier of the job

**requestOptions:** `DocumentIntelligenceClient.RequestOptions`

client.documentIntelligence.getDownloadLinks(job_id) -> SarvamAI.DocDigitizationDownloadFilesResponse

#### 📝 Description

Returns presigned URLs for downloading output files.

**Prerequisites:**

- Job must be in `Completed` or `PartiallyCompleted` state
- Failed jobs have no output available

#### 🔌 Usage

```typescript
await client.documentIntelligence.getDownloadLinks("job_id");
```
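
The job states listed for `getStatus` and the prerequisites above can be captured in two small predicates. This is a sketch: `DocJobState`, `isTerminalState`, and `hasDownloadableOutput` are hypothetical names, and treating `Completed`, `PartiallyCompleted`, and `Failed` as the terminal states is an inference from the state descriptions rather than something the reference states outright.

```typescript
// Job states as listed in the getStatus description.
type DocJobState =
    | "Accepted"
    | "Pending"
    | "Running"
    | "Completed"
    | "PartiallyCompleted"
    | "Failed";

// Assumption: these three states mean no further processing will happen.
function isTerminalState(state: DocJobState): boolean {
    return state === "Completed" || state === "PartiallyCompleted" || state === "Failed";
}

// Output can only be downloaded for Completed or PartiallyCompleted jobs.
function hasDownloadableOutput(state: DocJobState): boolean {
    return state === "Completed" || state === "PartiallyCompleted";
}
```

A polling loop could stop when `isTerminalState` returns true and call `getDownloadLinks` only when `hasDownloadableOutput` is also true.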

#### ⚙️ Parameters

**job_id:** `string` — The unique identifier of the job

**requestOptions:** `DocumentIntelligenceClient.RequestOptions`