# stream-json

> Micro-library of Node.js stream components for creating custom JSON processing pipelines with a minimal memory footprint. Parse JSON files far exceeding available memory using a SAX-inspired streaming token API. One dependency: `stream-chain`.

## Install

    npm i stream-json

## Quick start

```js
import chain from 'stream-chain';
import {parser} from 'stream-json';
import {streamArray} from 'stream-json/streamers/stream-array.js';
import fs from 'node:fs';

const pipeline = chain([
  fs.createReadStream('data.json'),
  parser(),
  streamArray(),
  ({value}) => console.log(value)
]);
```

## API

### Parser

`parser(options)` — streaming JSON parser producing `{name, value}` tokens.

- Returns a function for use in `chain()`. Call `parser.asStream(options)` for a Node Duplex stream or `parser.asWebStream(options)` for a Web `{readable, writable}` pair.
- Options: `packKeys`, `packStrings`, `packNumbers` (default: true), `streamKeys`, `streamStrings`, `streamNumbers` (default: true), `jsonStreaming` (default: false).
- `packValues`/`streamValues` — shortcut to set all three at once.
- `parser` is `gen(fixUtf8Stream(), jsonParser())`; the named export `jsonParser` is the bare tokenizer (no UTF-8 front) for advanced/embedding use. The same gen/raw split applies to `jsoncParser`, `jsonlParser`, and the verifiers (`jsonVerifier`, `jsoncVerifier`).

```js
import {parser} from 'stream-json';
const pipeline = fs.createReadStream('data.json').pipe(parser.asStream());

// Web Streams substrate:
import {parser} from 'stream-json/web/parser.js';
const {readable, writable} = parser.asWebStream();
```

Every substrate-bearing component has both `stream-json/X.js` (Node + Web shapes) and `stream-json/web/X.js` (Web-only, browser-safe) entries.

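To make the `{name, value}` token shape concrete, here is a minimal, library-free sketch that folds a hand-written token sequence back into a JS value. The `assemble` helper and the `tokens` array are hypothetical illustrations, not part of stream-json, and the sequence assumes packed values only (no `start*`/`*Chunk`/`end*` pieces) with `numberValue` carrying its text as a string. For real pipelines, prefer the library's own `Assembler`.

```js
// Hand-written tokens for {"a": 1, "b": [true, null]} (illustrative, not captured from the parser).
const tokens = [
  {name: 'startObject'},
  {name: 'keyValue', value: 'a'},
  {name: 'numberValue', value: '1'}, // assumption: numbers arrive as strings
  {name: 'keyValue', value: 'b'},
  {name: 'startArray'},
  {name: 'trueValue', value: true},
  {name: 'nullValue', value: null},
  {name: 'endArray'},
  {name: 'endObject'}
];

// Hypothetical helper: rebuilds one JS value from a packed token sequence.
function assemble(tokens) {
  const stack = []; // currently open containers, innermost last
  let key = null;   // pending object key, set by keyValue
  let result;

  // Attach a finished value (or a freshly opened container) to its parent.
  const put = value => {
    if (!stack.length) {
      result = value;
      return;
    }
    const top = stack[stack.length - 1];
    if (Array.isArray(top)) {
      top.push(value);
    } else {
      top[key] = value;
      key = null;
    }
  };

  for (const tok of tokens) {
    switch (tok.name) {
      case 'startObject': { const o = {}; put(o); stack.push(o); break; }
      case 'startArray': { const a = []; put(a); stack.push(a); break; }
      case 'endObject':
      case 'endArray': stack.pop(); break;
      case 'keyValue': key = tok.value; break;
      case 'stringValue': put(tok.value); break;
      case 'numberValue': put(Number(tok.value)); break;
      case 'trueValue': put(true); break;
      case 'falseValue': put(false); break;
      case 'nullValue': put(null); break;
      // any other token (chunks, start/end markers) is ignored in this sketch
    }
  }
  return result;
}

console.log(JSON.stringify(assemble(tokens))); // prints {"a":1,"b":[true,null]}
```

Containers are attached to their parent as soon as they open, so the pending key is consumed immediately and nested objects cannot clobber it.
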
### Main module

The default export is `parserStream` — an alias for `parser.asStream()` that returns a parser as a Duplex stream:

```js
import parserStream from 'stream-json';
const stream = parserStream();
fs.createReadStream('data.json').pipe(stream);
```

For the SAX-style event API on Node (`stream.on('startObject', ...)`), wrap with `emit()`:

```js
import parserStream from 'stream-json';
import emit from 'stream-json/utils/emit.js';
const stream = emit(parserStream());
stream.on('startObject', () => { /* ... */ });
```

For the SAX-style event API on Web, use the `EventTarget`-based variants from `stream-json/web/emitter.js` or `stream-json/web/utils/emit.js`; subscribe with `addEventListener(name, ev => ev.detail)`. For hot paths on either substrate, prefer `for await (const tok of readable) handlers[tok.name]?.(tok.value)` — zero per-token allocation.

### Assembler

`Assembler` — class that reconstructs JS objects from tokens. Receives a per-value callback via the `onDone` option.

```js
import Assembler from 'stream-json/assembler.js';
const asm = Assembler.connectTo(parserStream, {onDone: asm => console.log(asm.current)});
```

- `asm.tapChain` — function for use in `chain()`.
- `asm.onDone(fn)` — set/clear the callback after construction.
- Options: `reviver`, `numberAsString`, `onDone`.

### Disassembler

`disassembler(options)` — JS objects → token stream (generator). Has `asStream` (Node Duplex) and `asWebStream` (Web pair). Web-only entry: `stream-json/web/disassembler.js`.

```js
import {disassembler} from 'stream-json/disassembler.js';
chain([objectSource, disassembler(), stringer(), destination]);
```

### Stringer

`Stringer` — Transform stream converting tokens back to JSON text. Has `asStream` (Node Duplex) and `asWebStream` (Web pair). Web-only entry: `stream-json/web/stringer.js`.

```js
import {stringer} from 'stream-json/stringer.js';
chain([parser(), pick({filter: 'data'}), stringer(), destination]);
```

### Emitter

`Emitter` — sink that re-emits tokens as named events. Node version is a Writable (EventEmitter); Web version is an EventTarget with `.writable` `WritableStream` attached.

```js
// Node
import emitter from 'stream-json/emitter.js';
const e = emitter();
e.on('startObject', () => { /* ... */ });

// Web
import emitter from 'stream-json/web/emitter.js';
const e = emitter();
e.addEventListener('startObject', () => { /* ... */ });
e.addEventListener('keyValue', ev => console.log(ev.detail));
// pipe a token-producing readable into e.writable
```

`Assembler.connectTo(stream, options)` (and `FlexAssembler.connectTo`) is substrate-aware — accepts either a Node Readable or a Web ReadableStream. For hot paths, prefer `for await (const tok of readable) asm.consume(tok)` over `connectTo` — no async-closure overhead, errors propagate directly.

## Filters

All filters accept `{filter, pathSeparator, once, streamKeys}` options. `filter` can be a string, RegExp, or `(stack, chunk) => boolean`.

- **`pick(options)`** — passes only matching subobjects, discards the rest.
- **`replace(options)`** — replaces matching subobjects. Extra option: `replacement` (function, value, or array of tokens).
- **`ignore(options)`** — removes matching subobjects completely.
- **`filter(options)`** — keeps matching subobjects preserving surrounding structure.

Each ships in both substrates with `asStream`, `asWebStream`, `withParser`, `withParserAsStream`, and `withParserAsWebStream` attached. Web-only entries: `stream-json/web/filters/.js`.

```js
import {pick} from 'stream-json/filters/pick.js';
import {ignore} from 'stream-json/filters/ignore.js';
import {streamValues} from 'stream-json/streamers/stream-values.js';

chain([
  parser(),
  pick({filter: 'data'}),
  ignore({filter: /\b_meta\b/i}),
  streamValues(),
  // processItem is a user-supplied handler (named to avoid shadowing Node's global `process`)
  ({value}) => processItem(value)
]);
```

## Streamers

Assemble complete JS objects from a token stream. All produce `{key, value}` objects, generic in the assembled value type (`streamArray()`, `streamValues()`, `streamObject()`; `value` defaults to `unknown`).

- **`streamValues(options)`** — streams successive JSON values. Use with `jsonStreaming` or after `pick`.
- **`streamArray(options)`** — streams elements of a single top-level array.
- **`streamObject(options)`** — streams properties of a single top-level object.

All support `objectFilter` for early rejection of objects during assembly. Each ships in both substrates with `asStream`, `asWebStream`, `withParser`, `withParserAsStream`, and `withParserAsWebStream` attached. Web-only entries: `stream-json/web/streamers/.js`.

```js
import {streamArray} from 'stream-json/streamers/stream-array.js';
chain([parser(), streamArray(), ({key, value}) => console.log(key, value)]);
```

## Utilities

- **`emit(stream)`** — attach token events to a Node Readable. Web variant (`stream-json/web/utils/emit.js`) takes a `ReadableStream` and returns an auto-piped `EventTarget`. Zero-allocation alternative: `for await (const tok of readable) handlers[tok.name]?.(tok.value)`.
- **`withParser(fn, options)`** — create `gen(parser(options), fn(options))` pipeline. Most components export `.withParser()` and `.withParserAsStream()`.
- **`FlexAssembler`** — Assembler with custom containers (Map, Set, etc.) at specific paths. Rules: `{filter, create, add, finalize?}`. Separate `objectRules` and `arrayRules`.
- **`Batch`** — Transform stream batching items into arrays. Option: `batchSize` (default: 1000).
  Both `asStream` and `asWebStream` attach `_batchSize` to the returned pair/stream.
- **`Verifier`** — validates JSON text (`charCodeAt` validator), reports exact error position. Has `asStream` and `asWebStream`. Named export `jsonVerifier` is the bare validator (no UTF-8 front).

### withParser shortcut

```js
import {withParser} from 'stream-json/streamers/stream-array.js';
const pipeline = withParser();
fs.createReadStream('data.json').pipe(pipeline);
```

## JSONL support

> **Deprecated — slated for removal in a future major.** stream-json's JSONL parser and stringer are thin re-exports of stream-chain's (`stream-chain/jsonl/parser.js`, `stream-chain/jsonl/stringerStream.js`). Use stream-chain's JSONL directly. stream-json is a JSON *token* library; JSONL yields whole objects per line and belongs in stream-chain with the other substrate components.

- **`jsonl/parser(options)`** — JSONL parser producing `{key, value}` objects. Options: `reviver`, `errorIndicator`. Has `asStream` and `asWebStream`. Web entry: `stream-json/web/jsonl/parser.js`.
- **`jsonl/stringer(options)`** — objects → JSONL text. Options: `replacer`, `space`, `separator`. Node entry is itself a `Transform`; `jsonlStringer.asWebStream` returns a Web `TransformStream`. Web entry (`stream-json/web/jsonl/stringer.js`) returns the `TransformStream` directly.

```js
import {parser} from 'stream-json/jsonl/parser.js';
import {stringer} from 'stream-json/jsonl/stringer.js';
chain([fs.createReadStream('data.jsonl'), parser(), ({value}) => transform(value), stringer(), destination]);
```

## JSONC support

- **`jsonc/parser(options)`** — JSONC parser (JSON with Comments). Same `charCodeAt` tokenizer as the standard parser, extended with `//` and `/* */` comments, trailing commas, and `whitespace` / `comment` / `comma` tokens.
  - Extra options: `streamWhitespace` (default: true), `streamComments` (default: true), `streamCommas` (default: false — emit a valueless `comma` token at every comma, separator or trailing, for faithful round-trip editing).
  - All standard parser options are supported.
- **`jsonc/stringer(options)`** — JSONC stringer. Passes `whitespace` and `comment` tokens through verbatim. Extra option: `useCommas` (default: false — render streamed `comma` tokens as `,`, auto-inserting a separator only when no comma token arrived, so output stays valid even if commas were dropped upstream).
- **`jsonc/verifier(options)`** — JSONC validator. Same `charCodeAt` validator as `Verifier`, accepting comments and trailing commas. Reports exact error position.

```js
import {parser as jsoncParser} from 'stream-json/jsonc/parser.js';
import {stringer as jsoncStringer} from 'stream-json/jsonc/stringer.js';
chain([fs.createReadStream('settings.jsonc'), jsoncParser(), jsoncStringer(), destination]);
```

All existing filters, streamers, and utilities work with JSONC parser output — they ignore unknown tokens.

JSONC also ships in both substrates: each has `asStream` (Node Duplex) and `asWebStream` (Web pair). Web entries: `stream-json/web/jsonc/{parser,stringer,verifier}.js`.

## File I/O (Node-only) — Since 3.3.0

- **`file/parseFile(options)`** — input-edge `gen()` stage that turns a path into a token stream. Returns `gen(asyncBlockReader(options), jsonParser(options))`. Drop at the head of a pipeline; drive with the path as the gen input value. Options: `readBlockSize` (default 64 KB) + all standard `parser()` options. Node-only (uses `node:fs/promises`).
- **`file/verifyFile(path, options)`** — standalone async JSON validator. Returns `Promise`. Rejects with `{message, line, pos, offset}` on invalid input. Options: `readBlockSize` + `jsonStreaming`.
- **`file/stringerToFile(path, options)`** — output-edge sink stage. Returns `gen(stringer(options), asyncBlockWriter(path, options))`.
  Drop at the tail; pipe MUST be driven through `pipe(...)` so the writer's flush closes the file. Options: `writeBlockSize` (default 1 MB) + all standard `stringer()` options.
- **`core/utils/pipe(...stages)`** — one-shot single-value driver: builds a fresh `gen`, calls `g(value)` then `g(none)` so flushable sinks like `stringerToFile` actually flush. Generic, web-safe.
- **`core/utils/drain(asyncGen)`** — drains any async iterable, returns the last yielded value (or `undefined`). Generic, web-safe.
- JSONC variants under `file/jsonc/{parser,verifier,stringer}.js` — same shapes, comments + trailing commas supported.

```js
import {parseFile} from 'stream-json/file/parser.js';
import {stringerToFile} from 'stream-json/file/stringer.js';
import {pipe} from 'stream-chain/utils/pipe.js';
import {drain} from 'stream-chain/utils/drain.js';

// file → tokens → file (round-trip)
await drain(pipe(parseFile(), stringerToFile('out.json'))('in.json'));

// validate a file
import {verifyFile} from 'stream-json/file/verifier.js';
await verifyFile('candidate.json'); // throws {message, line, pos, offset} on invalid
```

Perf (Intel i3‑10110U, Node 26, 100 KB JSON):

- Realistic parse-with-work (counter inside the pipeline; `bench/parse-count.js`): `pipe(parseFile(), counter)` ≈ 9.4 ms vs idiomatic `chain([createReadStream, parser()]) + on('data', counter)` ≈ 15.8 ms — **~68% faster**. gen() and chain() executors are within noise of each other.
- Round-trip (`bench/file-roundtrip.js`): `pipe(parseFile(), stringerToFile())` ≈ 30 ms vs idiomatic chain + `createWriteStream` ≈ 50 ms — **~1.6× faster**.
- Verify: within noise of the idiomatic chain.

## Common patterns

### Stream a huge JSON array

```js
chain([
  fs.createReadStream('huge-array.json'),
  parser(),
  streamArray(),
  ({value}) => processItem(value)
]);
```

### Pick and filter nested data

```js
chain([
  fs.createReadStream('data.json'),
  parser(),
  pick({filter: 'results'}),
  streamArray(),
  ({value}) => value.active ?
    value : chain.none
]);
```

### Edit JSON and write back

```js
chain([
  fs.createReadStream('input.json'),
  parser(),
  ignore({filter: /\bsecret\b/}),
  Stringer.make(),
  fs.createWriteStream('output.json')
]);
```

## Token protocol

The parser emits `{name, value}` tokens: `startObject`, `endObject`, `startArray`, `endArray`, `startKey`, `endKey`, `keyValue`, `startString`, `endString`, `stringChunk`, `stringValue`, `startNumber`, `endNumber`, `numberChunk`, `numberValue`, `nullValue`, `trueValue`, `falseValue`.

These names are the closed `TokenName` type; `Token` is a discriminated union over `name` (narrowing on `token.name` tightens `token.value`). Both are exported from `stream-json/parser.js`.

## Links

- Docs: https://github.com/uhop/stream-json/wiki
- npm: https://www.npmjs.com/package/stream-json
- Full LLM reference: https://github.com/uhop/stream-json/blob/master/llms-full.txt