# `datocms-structured-text-utils`

A set of TypeScript types and helpers for working with DatoCMS Structured Text fields.

## Installation

Using [npm](http://npmjs.org/):

```sh
npm install datocms-structured-text-utils
```

Using [yarn](https://yarnpkg.com/):

```sh
yarn add datocms-structured-text-utils
```

## `dast` document validation

You can use the `validate()` function to check if an object is compatible with the [`dast` specification](https://www.datocms.com/docs/structured-text/dast):

```js
import { validate } from 'datocms-structured-text-utils';

const structuredText = {
  value: {
    schema: 'dast',
    document: {
      type: 'root',
      children: [
        {
          type: 'heading',
          level: 1,
          children: [
            {
              type: 'span',
              value: 'Hello!',
              marks: ['invalidmark'],
            },
          ],
        },
      ],
    },
  },
};

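// validate() expects the document object itself ({ schema, document }),
// so pass the `value` property rather than the wrapper
const result = validate(structuredText.value);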

if (!result.valid) {
  console.error(result.message); // e.g. span has an invalid mark "invalidmark"
}
```

## `dast` format specs

The package exports a number of constants that represent the rules of the [`dast` specification](https://www.datocms.com/docs/structured-text/dast).

Take a look at the [definitions.ts](https://github.com/datocms/structured-text/blob/main/packages/utils/src/definitions.ts) file for their definition:

```javascript
const blockquoteNodeType = 'blockquote';
const blockNodeType = 'block';
const codeNodeType = 'code';
const headingNodeType = 'heading';
const inlineItemNodeType = 'inlineItem';
const itemLinkNodeType = 'itemLink';
const linkNodeType = 'link';
const listItemNodeType = 'listItem';
const listNodeType = 'list';
const paragraphNodeType = 'paragraph';
const rootNodeType = 'root';
const spanNodeType = 'span';

const allowedNodeTypes = [
  'paragraph',
  'list',
  // ...
];

const allowedChildren = {
  paragraph: 'inlineNodes',
  list: ['listItem'],
  // ...
};

const inlineNodeTypes = [
  'span',
  'link',
  // ...
];

const allowedAttributes = {
  heading: ['level', 'children'],
  // ...
};

const allowedMarks = [
  'strong',
  'code',
  // ...
];
```
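
As an illustration of how these constants might be used, the sketch below checks a span's marks against a list of permitted marks. `findDisallowedMarks` is a hypothetical helper, not part of the package. The list is passed in as a parameter so the snippet runs on its own, and `sampleMarks` is a stand-in that contains only the two marks visible in the excerpt above.

```javascript
// Hypothetical helper (not exported by the package): returns the marks on a
// span that do not appear in the given list of permitted marks.
function findDisallowedMarks(span, permittedMarks) {
  const marks = span.marks ?? [];
  return marks.filter((mark) => !permittedMarks.includes(mark));
}

// Stand-in list for illustration only. In real code, pass the `allowedMarks`
// constant imported from 'datocms-structured-text-utils' instead.
const sampleMarks = ['strong', 'code'];

const span = {
  type: 'span',
  value: 'Hello!',
  marks: ['strong', 'invalidmark'],
};

const disallowed = findDisallowedMarks(span, sampleMarks);
console.log(disallowed); // [ 'invalidmark' ]
```

For full document validation, `validate()` remains the more complete option; a helper like this is only useful for targeted checks.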

## TypeScript Types

The package exports TypeScript types for all the different nodes that a [`dast` document](https://www.datocms.com/docs/structured-text/dast) can contain.

Take a look at the [types.ts](https://github.com/datocms/structured-text/blob/main/packages/utils/src/types.ts) file for their definition:

```typescript
type Node
type BlockNode
type InlineNode
type RootType
type Root
type ParagraphType
type Paragraph
type HeadingType
type Heading
type ListType
type List
type ListItemType
type ListItem
type CodeType
type Code
type BlockquoteType
type Blockquote
type BlockType
type Block
type SpanType
type Mark
type Span
type LinkType
type Link
type ItemLinkType
type ItemLink
type InlineItemType
type InlineItem
type WithChildrenNode
type Document
type NodeType
type CdaStructuredTextValue
type Record
```

## TypeScript Type Guards

The package also exports a number of [type guards](https://www.typescriptlang.org/docs/handbook/advanced-types.html#user-defined-type-guards) that you can use to guarantee the type of a node within a given scope.

Take a look at the [guards.ts](https://github.com/datocms/structured-text/blob/main/packages/utils/src/guards.ts) file for their definition:

```typescript
function hasChildren(node: Node): node is WithChildrenNode {}
function isInlineNode(node: Node): node is InlineNode {}
function isHeading(node: Node): node is Heading {}
function isSpan(node: Node): node is Span {}
function isRoot(node: Node): node is Root {}
function isParagraph(node: Node): node is Paragraph {}
function isList(node: Node): node is List {}
function isListItem(node: Node): node is ListItem {}
function isBlockquote(node: Node): node is Blockquote {}
function isBlock(node: Node): node is Block {}
function isCode(node: Node): node is Code {}
function isLink(node: Node): node is Link {}
function isItemLink(node: Node): node is ItemLink {}
function isInlineItem(node: Node): node is InlineItem {}
function isCdaStructuredTextValue(
  object: any,
): object is CdaStructuredTextValue {}
```

### Narrowing blocks by model

When your DAST tree has been fetched with typed responses (e.g. the CMA client in `nested: true` mode), `block.item` / `inlineBlock.item` is a union of all possible block-model shapes. `isBlockWithItemOfType` and `isInlineBlockWithItemOfType` filter that union down to a single model and narrow the node accordingly.

Both guards support two call styles:

- Curried: `isBlockWithItemOfType(itemTypeId)` returns a predicate, handy with `findFirstNode` / `collectNodes` / `Array#filter`.
- Direct: `isBlockWithItemOfType(itemTypeId, node)` checks a node inline (e.g. inside an `if`).

```typescript
import {
  findFirstNode,
  isBlockWithItemOfType,
  isInlineBlockWithItemOfType,
} from 'datocms-structured-text-utils';

const WARNING_BLOCK_TYPE_ID = 'abc123' as const;
const CALLOUT_BLOCK_TYPE_ID = 'def456' as const;

// Curried — block
const needle = findFirstNode(
  body.document,
  isBlockWithItemOfType(WARNING_BLOCK_TYPE_ID),
);

if (needle) {
  // needle.node.item is narrowed to the Warning block-model shape
  console.log(needle.node.item.attributes.message);
}

// Direct — block
if (isBlockWithItemOfType(WARNING_BLOCK_TYPE_ID, node)) {
  console.log(node.item.attributes.message);
}

// Same shape for inline blocks
const callout = findFirstNode(
  body.document,
  isInlineBlockWithItemOfType(CALLOUT_BLOCK_TYPE_ID),
);

if (callout) {
  // callout.node.item is narrowed to the Callout inline-block-model shape
  console.log(callout.node.item.attributes.label);
}

if (isInlineBlockWithItemOfType(CALLOUT_BLOCK_TYPE_ID, node)) {
  console.log(node.item.attributes.label);
}
```

Pass the `itemTypeId` as a literal (`as const` on pre-set constants) for narrowing to kick in. At runtime the guards walk `item.relationships.item_type.data.id`, so they work for any block item carrying that shape — CMA nested-mode responses and the object variants of request payloads. Bare string IDs (used in request payloads to reference unchanged blocks) are filtered out.
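
The runtime behaviour described above can be approximated with the following sketch. It is an illustrative re-implementation under stated assumptions, not the library's source: `blockHasItemOfType` is a hypothetical name, and the IDs and item shapes in the sample data are made up for the example.

```javascript
// Illustrative re-implementation (NOT the library's source): matches block
// nodes whose item is an object carrying the given item type ID.
function blockHasItemOfType(itemTypeId, node) {
  // Curried form: with no node given, return a predicate for later use.
  if (node === undefined) {
    return (candidate) => blockHasItemOfType(itemTypeId, candidate);
  }
  if (node.type !== 'block') return false;
  // Bare string IDs carry no model information, so they never match.
  if (typeof node.item !== 'object' || node.item === null) return false;
  return node.item.relationships?.item_type?.data?.id === itemTypeId;
}

// Sample data, invented for this example.
const nodes = [
  { type: 'paragraph', children: [{ type: 'span', value: 'Intro' }] },
  { type: 'block', item: 'block-123' }, // bare ID reference
  {
    type: 'block',
    item: {
      id: 'block-456',
      relationships: { item_type: { data: { id: 'abc123' } } },
    },
  },
];

// Curried style, used as an Array#filter predicate
const matches = nodes.filter(blockHasItemOfType('abc123'));
console.log(matches.length); // 1

// Direct style
console.log(blockHasItemOfType('abc123', nodes[1])); // false
```

The real guards additionally narrow the TypeScript type of `node.item`, which a plain JavaScript sketch cannot show.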

## Tree Manipulation Utilities

The package provides a comprehensive set of utilities for traversing, transforming, and querying structured text trees. All utilities support both synchronous and asynchronous operations, work with both document wrappers and plain nodes, and provide full TypeScript support with proper type narrowing.

### Visiting Nodes

| Function                                                                                                           | Description                                                           |
| ------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------- |
| [`forEachNode`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L111)      | Visit every node in the tree synchronously using pre-order traversal  |
| [`forEachNodeAsync`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L145) | Visit every node in the tree asynchronously using pre-order traversal |

Visit all nodes in the tree using pre-order traversal:

```javascript
import { forEachNode, forEachNodeAsync } from 'datocms-structured-text-utils';

// Synchronous traversal
forEachNode(structuredText, (node, parent, path) => {
  console.log(`Node type: ${node.type}, Path: ${path.join('.')}`);
});

// Asynchronous traversal
await forEachNodeAsync(structuredText, async (node, parent, path) => {
  await processNode(node);
});
```
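
To make the visiting order concrete, here is a minimal sketch of a pre-order walk. `visitPreOrder` is a hypothetical stand-in written for this example, not the library's implementation; it works on plain nodes only and omits the `path` argument that `forEachNode` passes to its callback.

```javascript
// Illustrative pre-order walk (NOT the library's source): a node is visited
// before its children, and children are visited left to right.
function visitPreOrder(node, callback, parent = null) {
  callback(node, parent);
  for (const child of node.children ?? []) {
    visitPreOrder(child, callback, node);
  }
}

const tree = {
  type: 'root',
  children: [
    { type: 'heading', level: 1, children: [{ type: 'span', value: 'Title' }] },
    { type: 'paragraph', children: [{ type: 'span', value: 'Body' }] },
  ],
};

const visited = [];
visitPreOrder(tree, (node) => visited.push(node.type));
console.log(visited.join(' > ')); // root > heading > span > paragraph > span
```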

### Transforming Trees

| Function                                                                                                        | Description                                                        |
| --------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------ |
| [`mapNodes`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L309)      | Transform nodes in the tree synchronously (1:1, splat, or remove)  |
| [`mapNodesAsync`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L355) | Transform nodes in the tree asynchronously (1:1, splat, or remove) |

`mapNodes` walks the tree **bottom-up**: when the mapper sees a node, its
descendants have already been transformed, and the mapper's return for that
node is final.

The mapper may return:

- **a single node** — replaces the input node 1:1
- **an array of nodes** — splatted into the parent's children (1:N)
- **`null` or `undefined`** — removes the node from its parent (1:0)

Returning an array or nullish for the root node throws, since the function
returns a single node.

```javascript
import {
  mapNodes,
  mapNodesAsync,
  isHeading,
  isSpan,
  isThematicBreak,
} from 'datocms-structured-text-utils';

// 1:1 — transform heading levels for better hierarchy
const enhanced = mapNodes(structuredText, (node) => {
  if (isHeading(node) && node.level === 1) {
    return { ...node, level: 2 };
  }
  return node;
});

// 1:N — split a span into a span + a link by returning an array
const linked = mapNodes(structuredText, (node) => {
  if (!isSpan(node)) return node;
  const parts = node.value.split(/(\bclick here\b)/);
  return parts
    .filter((part) => part)
    .map((part) =>
      part === 'click here'
        ? {
            type: 'link',
            url: '/target',
            children: [{ type: 'span', value: 'click here' }],
          }
        : { type: 'span', value: part },
    );
});

// 1:0 — drop nodes by returning null
const compact = mapNodes(structuredText, (node) =>
  isThematicBreak(node) ? null : node,
);

// Async transformation with external API calls
const processed = await mapNodesAsync(structuredText, async (node) => {
  if (isSpan(node) && node.value.includes('TODO')) {
    const updatedText = await translateText(node.value);
    return { ...node, value: updatedText };
  }
  return node;
});
```
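
The bottom-up semantics can be sketched in a few lines. This is an illustrative approximation rather than the library's implementation: `mapSubtree` and `mapTreeBottomUp` are hypothetical names, and the sketch handles plain nodes only (no document wrappers, no async).

```javascript
// Illustrative sketch of bottom-up mapping (NOT the library's source).
// The mapper may return a node (1:1), an array (1:N), or null/undefined (1:0).
function mapSubtree(node, mapper) {
  if (!Array.isArray(node.children)) return mapper(node);
  // Children are mapped first, so the mapper sees already-transformed children.
  const children = node.children.flatMap((child) => {
    const mapped = mapSubtree(child, mapper);
    if (mapped === null || mapped === undefined) return []; // 1:0, drop
    return Array.isArray(mapped) ? mapped : [mapped]; // 1:N splat or 1:1
  });
  return mapper({ ...node, children });
}

function mapTreeBottomUp(root, mapper) {
  const mapped = mapSubtree(root, mapper);
  // The root has no parent to splat into or be removed from.
  if (mapped === null || mapped === undefined || Array.isArray(mapped)) {
    throw new Error('The mapper must return a single node for the root');
  }
  return mapped;
}

const tree = {
  type: 'root',
  children: [
    { type: 'paragraph', children: [{ type: 'span', value: 'click here' }] },
    { type: 'thematicBreak' },
  ],
};

const mappedTree = mapTreeBottomUp(tree, (node) => {
  if (node.type === 'thematicBreak') return null; // 1:0
  if (node.type === 'span') {
    // 1:N, one span per word
    return node.value.split(' ').map((value) => ({ type: 'span', value }));
  }
  return node; // 1:1
});

console.log(mappedTree.children.length); // 1
console.log(mappedTree.children[0].children.map((span) => span.value)); // [ 'click', 'here' ]
```

Because children are processed before their parent, the paragraph mapper in this sketch already receives the two split spans, which matches the "descendants have already been transformed" rule described above.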

### Finding Nodes

| Function                                                                                                             | Description                                                  |
| -------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------ |
| [`collectNodes`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L402)       | Collect all nodes that match a predicate function            |
| [`collectNodesAsync`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L460)  | Collect all nodes that match an async predicate function     |
| [`findFirstNode`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L499)      | Find the first node that matches a predicate function        |
| [`findFirstNodeAsync`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L577) | Find the first node that matches an async predicate function |

Find specific nodes using predicates or type guards:

```javascript
import {
  findFirstNode,
  findFirstNodeAsync,
  collectNodes,
  collectNodesAsync,
  isSpan,
  isHeading,
} from 'datocms-structured-text-utils';

// Find first node matching condition
const firstHeading = findFirstNode(structuredText, isHeading);
if (firstHeading) {
  console.log(`Found heading: ${firstHeading.node.level}`);
}

// Collect all nodes matching condition
const allSpans = collectNodes(structuredText, isSpan);
const textContent = allSpans.map(({ node }) => node.value).join('');

// Find nodes with specific attributes
const strongText = collectNodes(
  structuredText,
  (node) => isSpan(node) && node.marks?.includes('strong'),
);
```

### Filtering Trees

| Function                                                                                                           | Description                                             |
| ------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------- |
| [`filterNodes`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L626)      | Remove nodes that don't match a predicate synchronously |
| [`filterNodesAsync`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L709) | Remove nodes that don't match an async predicate        |

Remove nodes that don't match a predicate:

```javascript
import {
  filterNodes,
  filterNodesAsync,
  isCode,
  isBlock,
} from 'datocms-structured-text-utils';

// Remove all code blocks
const withoutCode = filterNodes(structuredText, (node) => !isCode(node));

// Async filtering with external validation
const validated = await filterNodesAsync(structuredText, async (node) => {
  if (isBlock(node)) {
    return await validateBlockItem(node.item);
  }
  return true;
});
```

### Reducing Trees

| Function                                                                                                           | Description                                                            |
| ------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------- |
| [`reduceNodes`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L796)      | Reduce the tree to a single value using a synchronous reducer function |
| [`reduceNodesAsync`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L841) | Reduce the tree to a single value using an async reducer function      |

Reduce the entire tree to a single value:

```javascript
import {
  reduceNodes,
  reduceNodesAsync,
  isSpan,
} from 'datocms-structured-text-utils';

// Extract all text content
const textContent = reduceNodes(
  structuredText,
  (acc, node) => {
    if (isSpan(node)) {
      return acc + node.value;
    }
    return acc;
  },
  '',
);

// Count nodes by type
const nodeCounts = reduceNodes(
  structuredText,
  (acc, node) => {
    acc[node.type] = (acc[node.type] || 0) + 1;
    return acc;
  },
  {},
);
```

### Checking Conditions

| Function                                                                                                         | Description                                                                           |
| ---------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- |
| [`someNode`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L883)       | Check if any node in the tree matches a predicate (short-circuit evaluation)          |
| [`someNodeAsync`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L925)  | Check if any node in the tree matches an async predicate (short-circuit evaluation)   |
| [`everyNode`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L967)      | Check if every node in the tree matches a predicate (short-circuit evaluation)        |
| [`everyNodeAsync`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/manipulation.ts#L998) | Check if every node in the tree matches an async predicate (short-circuit evaluation) |

Test if any or all nodes match a condition:

```javascript
import {
  someNode,
  everyNode,
  someNodeAsync,
  everyNodeAsync,
  isHeading,
  isSpan,
  isBlock,
} from 'datocms-structured-text-utils';

// Check if document contains any headings
const hasHeadings = someNode(structuredText, isHeading);

// Check if all spans have text content
const allSpansHaveText = everyNode(
  structuredText,
  (node) => !isSpan(node) || (node.value && node.value.length > 0),
);

// Async validation
const allBlocksValid = await everyNodeAsync(
  structuredText,
  async (node) => !isBlock(node) || (await validateBlock(node.item)),
);
```

### Type Safety and Path Information

All utilities provide full TypeScript support with type narrowing and path information:

```typescript
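import {
  collectNodes,
  isHeading,
  isSpan,
} from 'datocms-structured-text-utils';
import type { Span } from 'datocms-structured-text-utils';

// Type guards automatically narrow types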
const headings = collectNodes(structuredText, isHeading);
// headings is now Array<{ node: Heading; path: TreePath }>

headings.forEach(({ node, path }) => {
  // TypeScript knows node is Heading type
  console.log(`Level ${node.level} heading at ${path.join('.')}`);
});

// Custom type guards work too
const strongSpans = collectNodes(
  structuredText,
  // A type predicate must return a boolean, so coerce the optional-chain result
  (node): node is Span =>
    isSpan(node) && (node.marks?.includes('strong') ?? false),
);
// strongSpans is now Array<{ node: Span; path: TreePath }>
```

## Tree Visualization with Inspector

The package includes a tree visualization utility that renders structured text documents as ASCII trees, which makes it easier to debug and understand document structure during development.

### Basic Usage

| Function                                                                                               | Description                                                |
| ------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------- |
| [`inspect`](https://github.com/datocms/structured-text/blob/main/packages/utils/src/inspector.ts#L202) | Render a structured text document or node as an ASCII tree |

```javascript
import { inspect } from 'datocms-structured-text-utils';

const structuredText = {
  schema: 'dast',
  document: {
    type: 'root',
    children: [
      {
        type: 'heading',
        level: 1,
        children: [{ type: 'span', value: 'Main Title' }],
      },
      {
        type: 'paragraph',
        children: [
          { type: 'span', value: 'This is a ' },
          { type: 'span', marks: ['strong'], value: 'bold' },
          { type: 'span', value: ' paragraph.' },
        ],
      },
      {
        type: 'block',
        item: 'block-123',
      },
    ],
  },
};

console.log(inspect(structuredText));
```

**Output:**

```
├ heading (level: 1)
│ └ span "Main Title"
├ paragraph
│ ├ span "This is a "
│ ├ span (marks: strong) "bold"
│ └ span " paragraph."
└ block (item: "block-123")
```

### Custom Block Formatting

The inspector supports custom formatting for block and inline block nodes, allowing you to display rich information about embedded content:

```javascript
import { inspect } from 'datocms-structured-text-utils';

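// Example with block objects instead of just IDs
const blockObject = {
  id: 'block-456',
  type: 'item',
  attributes: {
    title: 'Hero Section',
    subtitle: 'Welcome to our site',
    buttonText: 'Get Started',
  },
};

// Document embedding the block object between two paragraphs
const structuredText = {
  schema: 'dast',
  document: {
    type: 'root',
    children: [
      {
        type: 'paragraph',
        children: [{ type: 'span', value: 'Content before block' }],
      },
      { type: 'block', item: blockObject },
      {
        type: 'paragraph',
        children: [{ type: 'span', value: 'Content after block' }],
      },
    ],
  },
};

// Simple formatter: handles both bare IDs and block objects
const tree = inspect(structuredText, {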
  blockFormatter: (item, maxWidth) => {
    if (typeof item === 'string') return `ID: ${item}`;
    return `id: ${item.id}\ntitle: ${item.attributes.title}`;
  },
});

console.log(tree);
```

**Output:**

```
├ paragraph
│ └ span "Content before block"
├ block
│ id: block-456
│ title: Hero Section
└ paragraph
  └ span "Content after block"
```
