# Transliter

[![NPM Version](https://img.shields.io/npm/v/transliter.svg)](https://www.npmjs.com/package/transliter)
[![NPM Downloads](https://img.shields.io/npm/dm/transliter.svg)](https://www.npmjs.com/package/transliter)
[![License](https://img.shields.io/npm/l/transliter.svg)](https://github.com/vladzadvorny/transliter/blob/master/LICENSE)

Transliterate Cyrillic text to Latin using a choice of transliteration standards. Works with Russian, Ukrainian, Bulgarian, Macedonian, Serbian, and other Cyrillic-based languages.

## Features

- **Multiple transliteration standards**: ISO 9, BGN/PCGN, GOST 7.79-2000, and a simple URL-friendly scheme
- **Language detection**: Detect Cyrillic text and identify specific languages
- **URL slug generation**: Create SEO-friendly URLs from Cyrillic text
- **Backward compatible**: Maintains original API while adding new features
- **Extensible**: Add custom transliteration standards

## Installation

```sh
npm install transliter
```

## Quick Start

```javascript
const { transliter, slugify, isCyrillic } = require("transliter");

// Basic transliteration (default standard: 'iso9')
transliter("Транслитерируемый текст");
//-> Transliteriruemy`j tekst

// URL slug generation
slugify("Создание ссылки");
//-> sozdanie-ssylki

// Cyrillic detection
isCyrillic("Привет, мир!"); //-> true
isCyrillic("Hello, World!"); //-> false
```

## API Reference

### `transliter(text, [standard])`

Transliterate Cyrillic text to Latin using the specified standard.

**Parameters:**

- `text` (string): Text to transliterate
- `standard` (string, optional): Transliteration standard. Default: `'iso9'`
  - `'iso9'`: ISO 9 System B (scientific/linguistic)
  - `'bgn-pcgn'`: BGN/PCGN (US Board on Geographic Names and UK Permanent Committee on Geographical Names)
  - `'gost'`: GOST 7.79-2000 System B (Russian standard)
  - `'simple'`: Simple URL-friendly transliteration

**Returns:** (string) Transliterated text

**Examples:**

```javascript
// Different standards produce different results
transliter("Щука", "iso9"); //-> Shhuka
transliter("Щука", "bgn-pcgn"); //-> Shchuka
transliter("Щука", "gost"); //-> Shhuka
transliter("Щука", "simple"); //-> schuka

// Language support
transliter("Україна", "simple"); //-> ukraina
transliter("България", "simple"); //-> balgariya
transliter("Македонија", "simple"); //-> makedonija
```
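
To illustrate why the standards give different results, here is a minimal sketch of table-driven transliteration. `SAMPLE_MAPS` and `applyMap` are hypothetical names written for this example, the tables cover only the letters of "Щука", and the capitalization rule is an assumption; none of this is taken from the library's source.

```javascript
// Hypothetical mapping tables, limited to the letters of "щука".
// The library ships its own tables covering full alphabets.
const SAMPLE_MAPS = {
  "gost": { щ: "shh", у: "u", к: "k", а: "a" },
  "bgn-pcgn": { щ: "shch", у: "u", к: "k", а: "a" },
};

// Replace each character via a lowercase lookup table.
// Unmapped characters (Latin letters, digits, punctuation) pass through.
function applyMap(text, map) {
  let out = "";
  for (const ch of text) {
    const lower = ch.toLowerCase();
    const mapped = map[lower];
    if (mapped === undefined) {
      out += ch;
    } else if (ch !== lower) {
      // Uppercase source letter: capitalize only the first Latin letter.
      out += mapped.charAt(0).toUpperCase() + mapped.slice(1);
    } else {
      out += mapped;
    }
  }
  return out;
}

console.log(applyMap("Щука", SAMPLE_MAPS["gost"])); // Shhuka
console.log(applyMap("Щука", SAMPLE_MAPS["bgn-pcgn"])); // Shchuka
```

Only the table changes between standards; the replacement loop stays the same.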

### `transliter.getStandards()`

Get information about available transliteration standards.

**Returns:** (object) Object with standard names as keys and their metadata

**Example:**

```javascript
const standards = transliter.getStandards();
console.log(standards.iso9.name); //-> 'ISO 9 System B'
```

### `transliter.getStandard(standard)`

Get details about a specific standard.

**Parameters:**

- `standard` (string): Standard name

**Returns:** (object|null) Standard details or null if not found

### `transliter.addStandard(name, standard)`

Add a custom transliteration standard.

**Parameters:**

- `name` (string): Standard identifier
- `standard` (object): Standard object with `{name, description, map}` properties
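
The three registry methods above can be pictured with a small stand-alone sketch. The functions below are hypothetical local stand-ins, not the library's internals. The validation step and the lowercase single-letter keys in `map` are assumptions, since this README only states that a standard has `{name, description, map}` properties.

```javascript
// Hypothetical in-memory registry keyed by standard identifier.
const registry = new Map();

// Store a standard after checking it has the documented shape.
function addStandard(name, standard) {
  const { name: label, description, map } = standard || {};
  if (
    typeof label !== "string" ||
    typeof description !== "string" ||
    map === null ||
    typeof map !== "object"
  ) {
    throw new TypeError("standard must have {name, description, map}");
  }
  registry.set(name, { name: label, description, map });
}

// Return the stored standard, or null if the identifier is unknown.
function getStandard(name) {
  return registry.has(name) ? registry.get(name) : null;
}

// Return a plain object with identifiers as keys and standards as values.
function getStandards() {
  return Object.fromEntries(registry);
}

addStandard("leet", {
  name: "Leet (demo)",
  description: "Toy mapping for illustration only",
  map: { а: "4", е: "3", о: "0" },
});

console.log(getStandard("leet").name); // Leet (demo)
console.log(getStandard("missing")); // null
console.log(Object.keys(getStandards())); // [ 'leet' ]
```

With the library itself, an object of the same shape would be passed to `transliter.addStandard(name, standard)`.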

---

### `slugify(text, [options])`

Generate a URL slug from Cyrillic text.

**Parameters:**

- `text` (string): Text to convert to slug
- `options` (string|object, optional): Configuration options
  - If string: treated as separator (backward compatibility)
  - If object: `{ separator, standard, lowercase, trim, remove }`

**Options:**

- `separator` (string): Separator character (default: `'-'`)
- `standard` (string): Transliteration standard (default: `'simple'`)
- `lowercase` (boolean): Convert to lowercase (default: `true`)
- `trim` (boolean): Trim whitespace (default: `true`)
- `remove` (RegExp): Regex pattern of characters to remove (default: `/[^a-zA-Z0-9-_]/g`)

**Returns:** (string) Generated slug

**Examples:**

```javascript
// Basic usage
slugify("Создание ссылки");
//-> sozdanie-ssylki

// Custom separator
slugify("Создание ссылки", "_");
//-> sozdanie_ssylki

// Advanced options
slugify("Тест!@#$%", {
  separator: "_",
  standard: "iso9",
  remove: /[^a-zA-Z0-9_]/g,
});
//-> test

// Standard-specific methods
slugify.iso9("Щука"); //-> shhuka
slugify.bgnPcgn("Щука"); //-> shchuka
slugify.gost("Щука"); //-> shhuka
```
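
As a rough picture of how the options interact, the sketch below strings the steps together: transliterate, trim, lowercase, join with the separator, then strip disallowed characters. `demoTransliterate` and `demoSlugify` are hypothetical stand-ins, the step order is an assumption, and the tiny table covers only the letters of the sample phrase.

```javascript
// Hypothetical stand-in for the transliteration step.
const DEMO_MAP = {
  с: "s", о: "o", з: "z", д: "d", а: "a", н: "n",
  и: "i", е: "e", ы: "y", л: "l", к: "k",
};

function demoTransliterate(text) {
  return Array.from(text, (ch) => {
    const lower = ch.toLowerCase();
    const mapped = DEMO_MAP[lower];
    if (mapped === undefined) return ch; // leave unknown characters as-is
    return ch === lower ? mapped : mapped.toUpperCase();
  }).join("");
}

function demoSlugify(text, options = {}) {
  // A bare string is treated as the separator.
  const opts = typeof options === "string" ? { separator: options } : options;
  const {
    separator = "-",
    lowercase = true,
    trim = true,
    remove = /[^a-zA-Z0-9-_]/g,
  } = opts;

  let slug = demoTransliterate(text);
  if (trim) slug = slug.trim();
  if (lowercase) slug = slug.toLowerCase();
  slug = slug.replace(/\s+/g, separator); // whitespace runs become one separator
  slug = slug.replace(remove, ""); // drop everything the pattern matches
  return slug;
}

console.log(demoSlugify("Создание ссылки")); // sozdanie-ssylki
console.log(demoSlugify("Создание ссылки", "_")); // sozdanie_ssylki
```

In this sketch the `remove` pattern runs after the separator is inserted, so a separator other than `-` or `_` needs a `remove` pattern that allows it. Whether the library behaves the same way is not stated here.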

---

### `isCyrillic(text, [language])`

Detect if text contains Cyrillic characters.

**Parameters:**

- `text` (string): Text to check
- `language` (string, optional): Specific language to check for. Default: `'any'`
  - `'any'`: Any Cyrillic script
  - `'russian'`: Russian language characters
  - `'ukrainian'`: Ukrainian language characters
  - `'bulgarian'`: Bulgarian language characters
  - `'macedonian'`: Macedonian language characters
  - `'serbian'`: Serbian language characters
  - `'church-slavonic'`: Church Slavonic characters

**Returns:** (boolean) `true` if the text contains Cyrillic characters (of the given language, when `language` is specified), otherwise `false`

**Examples:**

```javascript
// Basic detection
isCyrillic("Привет, мир!"); //-> true
isCyrillic("Hello, World!"); //-> false

// Language-specific detection
isCyrillic("Україна", "ukrainian"); //-> true
isCyrillic("България", "bulgarian"); //-> true
isCyrillic("Hello", "russian"); //-> false
```
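
One common way to implement this kind of check is a Unicode range test plus per-language character sets. The sketch below is a hypothetical illustration, not the library's code: `demoIsCyrillic` and `LANGUAGE_MARKERS` are invented names, and the marker sets contain only letters that the named alphabet uses and modern Russian does not. The library may well use fuller per-language alphabets.

```javascript
// Any character in the basic Cyrillic block (U+0400 to U+04FF).
const ANY_CYRILLIC = /[\u0400-\u04FF]/;

// Hypothetical marker letters that distinguish an alphabet from Russian.
const LANGUAGE_MARKERS = {
  ukrainian: /[ґєіїҐЄІЇ]/,
  macedonian: /[ѓќѕЃЌЅ]/,
  serbian: /[ђћЂЋ]/,
};

function demoIsCyrillic(text, language = "any") {
  if (language === "any") return ANY_CYRILLIC.test(text);
  const marker = LANGUAGE_MARKERS[language];
  if (!marker) throw new RangeError(`no marker set for "${language}"`);
  return marker.test(text);
}

console.log(demoIsCyrillic("Привет, мир!")); // true
console.log(demoIsCyrillic("Hello, World!")); // false
console.log(demoIsCyrillic("Україна", "ukrainian")); // true
console.log(demoIsCyrillic("Привет", "ukrainian")); // false
```

A marker-letter approach cannot recognize text that happens to contain no distinguishing letters, which is one reason a real implementation might compare against whole alphabets instead.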

### `isCyrillic.language(text, language)`

Detect specific Cyrillic language.

### `isCyrillic.percentage(text)`

Get the percentage of Cyrillic characters in the text. Returns a number between 0 and 100.

### `isCyrillic.detectLanguages(text)`

Detect which Cyrillic languages are present in the text. Returns an array of language identifiers.

**Example:**

```javascript
isCyrillic.percentage("Привет Hello"); //-> 57.14 (approx)
isCyrillic.detectLanguages("Привет Україна"); //-> ['russian', 'ukrainian']
```
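
A percentage like this depends on which characters are counted. The sketch below shows one possible rule: the share of Cyrillic characters among all non-whitespace characters, rounded to two decimals. `demoCyrillicPercentage` is a hypothetical function and the counting rule is an assumption, so its figures may differ from what `isCyrillic.percentage` returns.

```javascript
// Share of Cyrillic characters among non-whitespace characters (0 to 100).
function demoCyrillicPercentage(text) {
  const chars = Array.from(text).filter((ch) => !/\s/.test(ch));
  if (chars.length === 0) return 0; // avoid dividing by zero on empty input
  const cyrillic = chars.filter((ch) => /[\u0400-\u04FF]/.test(ch)).length;
  return Math.round((cyrillic / chars.length) * 10000) / 100;
}

console.log(demoCyrillicPercentage("Да no")); // 50
console.log(demoCyrillicPercentage("абв")); // 100
console.log(demoCyrillicPercentage("")); // 0
```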

## Supported Languages

- **Russian** (ру́сский)
- **Ukrainian** (украї́нська)
- **Bulgarian** (бълга́рски)
- **Macedonian** (македонски)
- **Serbian** (српски / srpski)
- **Church Slavonic** (церковнославя́нский)
- And other Cyrillic-based languages

## Transliteration Standards

### ISO 9 System B

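Scientific transliteration standard, labeled `ISO 9 System B` by the library. Used in linguistics and academic publications. ISO 9:1995 itself maps each Cyrillic letter to a single Latin character using diacritics; the outputs shown in this README use ASCII digraphs instead (for example `Щ` → `Shh`), which matches GOST 7.79-2000 System B.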

### BGN/PCGN

Romanization system developed jointly by the United States Board on Geographic Names (BGN) and the Permanent Committee on Geographical Names for British Official Use (PCGN). Used for geographic names on maps and in official documents.

### GOST 7.79-2000 System B

Russian standard for bibliographic references. Commonly used in Russian publications and libraries.

### Simple

URL-friendly transliteration optimized for web addresses. Removes diacritics and uses common English equivalents.

## Building from Source

```sh
# Clone repository
git clone https://github.com/vladzadvorny/transliter.git
cd transliter

# Install dependencies
npm install

# Run tests
npm test
```

## License

MIT © Vlad Zadvorny
