# APG Unicode Parser

Parsers created with [`apg-js`](https://github.com/ldthomas/apg-js) and [`apg-lite`](https://github.com/ldthomas/apg-lite) operate on arrays of positive integers—typically representing character codes. The `apg-unicode` variant extends this by supporting **typed arrays**, enabling more memory-efficient parsing workflows for modern JavaScript environments.

> **Note:** `apg-unicode` does not natively parse Unicode. Instead, Unicode handling must be implemented via SABNF grammar and application logic. Typed arrays and conversion utilities simplify this process. See `./examples/unicode` for an illustration of UTF-8 and UTF-16 parsing without prior transformation.

## Key Features

### Typed Array Support

`apg-unicode` accepts the following input types:

- `Array`
- `Buffer`
- `Uint8Array`
- `Uint16Array`
- `Uint32Array`
- `String` (converted internally to `Uint32Array` of code points)

Using typed arrays, especially `Uint8Array`, can reduce memory usage by up to **75%** for large UTF-8 files compared with a `Uint32Array` of code points (one byte per element instead of four).
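The sizes behind that figure can be checked with standard JavaScript alone. The sketch below holds the same text as UTF-8 bytes, UTF-16 code units, and code points. The helper names `toCodeUnits` and `toCodePoints` are hypothetical and are not part of `apg-unicode`; whether the library's internal string conversion works the same way as `toCodePoints` is an assumption.

```javascript
// Sample input: pure ASCII, so every character encodes to one UTF-8 byte.
const text = 'INVITE sip:bob@example.com SIP/2.0\r\n'.repeat(1000);

// UTF-8 bytes: 1 byte per element (TextEncoder always emits UTF-8).
const utf8 = new TextEncoder().encode(text);

// Hypothetical helper: UTF-16 code units, 2 bytes per element.
function toCodeUnits(str) {
  const units = new Uint16Array(str.length);
  for (let i = 0; i < str.length; i += 1) {
    units[i] = str.charCodeAt(i);
  }
  return units;
}

// Hypothetical helper: code points, 4 bytes per element.
// The string iterator yields whole code points, so a surrogate pair
// becomes a single element.
function toCodePoints(str) {
  return Uint32Array.from(str, (ch) => ch.codePointAt(0));
}

const utf16 = toCodeUnits(text);
const codePoints = toCodePoints(text);

console.log('Uint8Array bytes: ', utf8.byteLength);
console.log('Uint16Array bytes:', utf16.byteLength);
console.log('Uint32Array bytes:', codePoints.byteLength);

// For pure ASCII input the Uint8Array needs a quarter of the Uint32Array's space.
const saving = 100 * (1 - utf8.byteLength / codePoints.byteLength);
console.log(`Saving of Uint8Array over Uint32Array: ${saving}%`);
```

The saving shrinks as the share of multi-byte characters grows, since each non-ASCII code point takes two to four UTF-8 bytes but still only one `Uint32Array` element.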

### Substring Parsing

Efficiently parse substrings within large strings without slicing or reallocating. Ideal for partial parsing scenarios. See `./examples/substrings` for usage patterns.
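To illustrate why working on a range of the input avoids reallocation, here is a minimal sketch in plain JavaScript. It does not use the `apg-unicode` API (see `./examples/substrings` for that); `findEquals` is a hypothetical helper that reads a range of the input by index arithmetic alone.

```javascript
// The full input, held once as UTF-8 bytes.
const input = new TextEncoder().encode('key1=alpha;key2=beta;key3=gamma');

// `slice` copies the selected bytes into a new buffer...
const copy = input.slice(11, 20);
// ...whereas `subarray` is only a view onto the original buffer.
const view = input.subarray(11, 20);

console.log(copy.buffer === input.buffer); // false: a new allocation
console.log(view.buffer === input.buffer); // true: no bytes were copied

// Hypothetical helper: find "=" inside [begin, begin + length)
// without slicing the input. Returns the absolute index, or -1.
function findEquals(chars, begin, length) {
  const end = Math.min(begin + length, chars.length);
  for (let i = begin; i < end; i += 1) {
    if (chars[i] === 0x3d) {
      return i; // 0x3d is the character code for "="
    }
  }
  return -1;
}

// Examine only the second pair, which starts at index 11 and is 9 bytes long.
const eq = findEquals(input, 11, 9);
console.log(eq); // 15
console.log(new TextDecoder().decode(view)); // key2=beta
```

Passing an offset and a length, rather than a copied slice, keeps reported positions relative to the full input and costs no extra memory however large the input is.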

## Parser Generation

Like `apg-lite`, `apg-unicode` does **not** include a parser generator. To generate a grammar object from an SABNF grammar file, run, for example:

```bash
npm run apg -- -i ./examples/stats/sip.bnf -o ./examples/stats/sip
```

## GitHub Usage

Clone the repo and run the user application and examples from the root directory:

```bash
git clone https://github.com/ldthomas/apg-unicode.git
cd apg-unicode
```

Include the modules in an application with:

```javascript
import { Parser } from './src/parser.js';
import { Ast } from './src/ast.js';
import { Trace } from './src/tracer.js';
import { Stats } from './src/stats.js';
import { utilities } from './src/utilities.js';
import { identifiers } from './src/identifiers.js';
```

To run the examples use:

| Command                                        | Description                                                                                                                                                                     |
| ---------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `node examples/ast/main`                       | Demonstrates Abstract Syntax Tree (AST) usage                                                                                                                                   |
| `node examples/trace/main`                     | Traces the parser through the parse tree                                                                                                                                        |
| `node examples/stats/main`                     | Collects and displays node hit statistics                                                                                                                                       |
| `node examples/substrings/main`                | Parses substrings within a full input string                                                                                                                                    |
| `node examples/unicode/main`                   | Parses UTF-8 and UTF-16 directly without prior transformation to code points                                                                                                    |
| Open `examples/web/web.html` in any browser    | Illustrates running a parser in a web page. Note that `web-app.js` is built from `app.js` with [esbuild](https://github.com/evanw/esbuild) by running `npm run esbuild`.      |

## npm Usage

Install the package from the npm registry. In the application root directory, run:

```bash
npm install apg-unicode
```

To access the modules in the application:

```javascript
import { Parser, Ast, Trace, Stats, utilities, identifiers } from 'apg-unicode';
```

## Documentation

The documentation is in the code in [docco](https://davidwalsh.name/javascript-documentation) format. To generate it, use:

```bash
npm run docco
```

The documentation will then be at `./docs/index.html`.

Or view it [here](https://sabnf.com/docs/apg-unicode/index.html) on the APG website.

## License

`apg-unicode` is licensed under the permissive [MIT](https://github.com/ldthomas/apg-unicode?tab=License-1-ov-file) license.
