# Datasets overview

**Added in:** `@mastra/core@1.4.0`

Datasets are collections of test cases that you run experiments against to measure how well your agents and workflows perform. Each change to a dataset's items creates a new version, so you can reproduce past experiments exactly. Pair datasets with [scorers](https://mastra.ai/docs/evals/overview) to track quality across prompts, models, or code changes.

## Usage

### Configure storage

Configure storage in your Mastra instance. Datasets require a storage adapter that provides the `datasets` domain:

```typescript
import { Mastra } from '@mastra/core'
import { LibSQLStore } from '@mastra/libsql'

export const mastra = new Mastra({
  storage: new LibSQLStore({
    id: 'my-store',
    url: 'file:./mastra.db',
  }),
})
```

### Accessing the datasets API

All dataset operations are available through `mastra.datasets`:

```typescript
const datasets = mastra.datasets

// Create a dataset
const dataset = await datasets.create({ name: 'my-dataset' })

// Retrieve an existing dataset
const existing = await datasets.get({ id: 'dataset-id' })

// List all datasets
const { datasets: all } = await datasets.list()
```

> **Info:** Visit the [`DatasetsManager` reference](https://mastra.ai/reference/datasets/datasets-manager) for the full list of methods.

## Studio

You can also manage datasets in [Studio](https://mastra.ai/docs/studio/overview). After opening Studio, select **Datasets** from the sidebar to see all your available datasets or create a new one.

To get started, select **Create Dataset** and set a name, description, and optional schemas. After confirming, you'll see the dataset details page with two tabs: **Items** and [**Experiments**](https://mastra.ai/docs/evals/datasets/running-experiments).

In the **Items** view you can add, update, and delete items, and view version history. Select **Add Item** to insert a new item with JSON editors for input and ground truth. From this view you can also import items in bulk from a CSV or JSON file. When importing, map each column to the corresponding dataset field.

Select **Versions** to see the full history of changes to the dataset. After selecting **Compare Versions**, choose any two versions and select **Compare** to see a side-by-side diff of all items that were added, changed, or removed between those versions.

## Creating a dataset

Call [`create()`](https://mastra.ai/reference/datasets/create) with a name and optional description:

```typescript
import { mastra } from '../index'

const dataset = await mastra.datasets.create({
  name: 'translation-pairs',
  description: 'English to Spanish translation test cases',
})

console.log(dataset.id) // auto-generated UUID
```

### Defining schemas

You can enforce the shape of `input` and `groundTruth` by passing Zod schemas. Mastra converts them to JSON Schema at creation time:

```typescript
import { z } from 'zod'
import { mastra } from '../index'

const dataset = await mastra.datasets.create({
  name: 'translation-pairs',
  inputSchema: z.object({
    text: z.string(),
    sourceLang: z.string(),
    targetLang: z.string(),
  }),
  groundTruthSchema: z.object({
    translation: z.string(),
  }),
})
```

Items that don't match the schema are rejected at insert time.

## Adding items

Use [`addItem()`](https://mastra.ai/reference/datasets/addItem) for a single item or [`addItems()`](https://mastra.ai/reference/datasets/addItems) to insert in bulk:

```typescript
// Single item
await dataset.addItem({
  input: { text: 'Hello', sourceLang: 'en', targetLang: 'es' },
  groundTruth: { translation: 'Hola' },
})

// Bulk insert
await dataset.addItems({
  items: [
    {
      input: { text: 'Goodbye', sourceLang: 'en', targetLang: 'es' },
      groundTruth: { translation: 'Adiós' },
    },
    {
      input: { text: 'Thank you', sourceLang: 'en', targetLang: 'es' },
      groundTruth: { translation: 'Gracias' },
    },
  ],
})
```
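If your test cases start out as tabular rows (for example, parsed from a CSV file), each row has to be mapped to an `input` and `groundTruth` pair before you call `addItems()`. The following is a minimal sketch of that mapping. The `Row` type, the column names (`text`, `source`, `target`, `expected`), and the `rowsToItems` helper are hypothetical and not part of Mastra:

```typescript
// Hypothetical row shape: one record per CSV/JSON row, keyed by column name.
type Row = Record<string, string>

// Item shape used by the addItems() example above.
interface TranslationItem {
  input: { text: string; sourceLang: string; targetLang: string }
  groundTruth: { translation: string }
}

// Map each row to a dataset item, skipping rows with missing or empty columns.
function rowsToItems(rows: Row[]): TranslationItem[] {
  const items: TranslationItem[] = []
  for (const row of rows) {
    const { text, source, target, expected } = row
    if (!text || !source || !target || !expected) continue // incomplete row
    items.push({
      input: { text, sourceLang: source, targetLang: target },
      groundTruth: { translation: expected },
    })
  }
  return items
}

const rows: Row[] = [
  { text: 'Good morning', source: 'en', target: 'es', expected: 'Buenos días' },
  { text: 'Please', source: 'en', target: 'es', expected: '' }, // skipped: no expected value
]

const items = rowsToItems(rows)
console.log(items.length)
```

You could then pass the result to `dataset.addItems({ items })`.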

## Updating and deleting items

[`updateItem()`](https://mastra.ai/reference/datasets/updateItem), [`deleteItem()`](https://mastra.ai/reference/datasets/deleteItem), and [`deleteItems()`](https://mastra.ai/reference/datasets/deleteItems) let you modify or remove existing items by `itemId`:

```typescript
await dataset.updateItem({
  itemId: 'item-abc-123',
  groundTruth: { translation: '¡Hola!' },
})

await dataset.deleteItem({ itemId: 'item-abc-123' })

await dataset.deleteItems({ itemIds: ['item-1', 'item-2'] })
```

## Listing and searching items

[`listItems()`](https://mastra.ai/reference/datasets/listItems) supports pagination and full-text search:

```typescript
// Paginated list
const { items, pagination } = await dataset.listItems({
  page: 0,
  perPage: 50,
})

// Full-text search
const { items: matches } = await dataset.listItems({
  search: 'Hello',
})

// List items at a specific version
const v2Items = await dataset.listItems({ version: 2 })
```
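To read every item rather than a single page, you can loop over pages until none remain. The sketch below is one way to do that. It assumes the returned `pagination` object exposes a `hasMore` flag, which this page doesn't confirm, so check the `listItems()` reference before relying on it. The `collectAllItems` and `fakeSource` helpers are hypothetical, and `fakeSource` is an in-memory stand-in so the example runs without a Mastra instance:

```typescript
// Minimal structural types for this sketch. `hasMore` is an assumption about
// the pagination object, not a confirmed field.
interface Page<T> {
  items: T[]
  pagination: { hasMore: boolean }
}

interface ItemSource<T> {
  listItems(args: { page: number; perPage: number }): Promise<Page<T>>
}

// Fetch pages starting at 0 until the source reports no more pages.
async function collectAllItems<T>(source: ItemSource<T>, perPage = 50): Promise<T[]> {
  const all: T[] = []
  for (let page = 0; ; page++) {
    const { items, pagination } = await source.listItems({ page, perPage })
    all.push(...items)
    // Stop on the last page; the empty-page check guards against endless loops.
    if (!pagination.hasMore || items.length === 0) break
  }
  return all
}

// In-memory stand-in for a dataset, used only to exercise the helper.
function fakeSource<T>(data: T[]): ItemSource<T> {
  return {
    async listItems({ page, perPage }) {
      const start = page * perPage
      return {
        items: data.slice(start, start + perPage),
        pagination: { hasMore: start + perPage < data.length },
      }
    },
  }
}
```

Because the helper depends only on the shape of `listItems()`, a real dataset object could be passed in place of `fakeSource(...)` if its return value matches these types.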

## Versioning

Every mutation to a dataset's items (add, update, or delete) bumps the dataset version. This lets you pin experiments to a specific snapshot of the data.
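As a mental model, you can think of each version as an immutable snapshot of the items. The toy class below illustrates that idea only. It is not how Mastra stores datasets, and the `VersionedItems` class, its methods, and the choice to start at version 0 are all assumptions made for illustration:

```typescript
// Toy model only: illustrates version bumps and snapshots, not Mastra's storage.
type Item = { id: string; value: string }

class VersionedItems {
  // Index = version number. Version 0 is the empty starting state.
  private snapshots: Item[][] = [[]]

  get version(): number {
    return this.snapshots.length - 1
  }

  // Every mutation stores a new snapshot, which bumps the version by one.
  private commit(next: Item[]): number {
    this.snapshots.push(next)
    return this.version
  }

  add(item: Item): number {
    return this.commit([...this.at(this.version), item])
  }

  update(id: string, value: string): number {
    return this.commit(this.at(this.version).map(i => (i.id === id ? { ...i, value } : i)))
  }

  remove(id: string): number {
    return this.commit(this.at(this.version).filter(i => i.id !== id))
  }

  // Read the items exactly as they were at a given version.
  at(version: number): Item[] {
    const snapshot = this.snapshots[version]
    if (!snapshot) throw new Error(`Unknown version ${version}`)
    return snapshot
  }
}

const store = new VersionedItems()
store.add({ id: 'a', value: 'Hola' }) // version 1
store.update('a', '¡Hola!')           // version 2
store.remove('a')                     // version 3

console.log(store.version)         // 3
console.log(store.at(1)[0]?.value) // Hola
```

Reading `store.at(1)` still returns the original item even after it was later updated and removed, which is the property that makes pinned experiments reproducible.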

### Listing versions

Use [`listVersions()`](https://mastra.ai/reference/datasets/listVersions) to see the paginated history of versions:

```typescript
const { versions, pagination } = await dataset.listVersions()

for (const v of versions) {
  console.log(`Version ${v.version} — created ${v.createdAt}`)
}
```

### Viewing item history

See how a specific item changed across versions by calling [`getItemHistory()`](https://mastra.ai/reference/datasets/getItemHistory) with the `itemId`:

```typescript
const history = await dataset.getItemHistory({ itemId: 'item-abc-123' })

for (const row of history) {
  console.log(`Version ${row.datasetVersion}`, row.input, row.groundTruth)
}
```
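Once you have the history rows, you may want a summary of which fields changed at each version. The sketch below compares consecutive rows. It relies only on the `datasetVersion`, `input`, and `groundTruth` fields used in the loop above. The `HistoryRow` type and the `summarizeChanges` helper are hypothetical, and the JSON-based comparison is a simplification that treats key order as significant:

```typescript
// Fields mirror those read in the loop above; the type itself is hypothetical.
interface HistoryRow {
  datasetVersion: number
  input: unknown
  groundTruth: unknown
}

type Change = { version: number; changed: Array<'input' | 'groundTruth'> }

function summarizeChanges(history: HistoryRow[]): Change[] {
  // Sort oldest first so each row is compared with its predecessor.
  const rows = [...history].sort((a, b) => a.datasetVersion - b.datasetVersion)
  const changes: Change[] = []
  for (let i = 1; i < rows.length; i++) {
    const prev = rows[i - 1]!
    const curr = rows[i]!
    const changed: Change['changed'] = []
    if (JSON.stringify(prev.input) !== JSON.stringify(curr.input)) changed.push('input')
    if (JSON.stringify(prev.groundTruth) !== JSON.stringify(curr.groundTruth)) changed.push('groundTruth')
    if (changed.length > 0) changes.push({ version: curr.datasetVersion, changed })
  }
  return changes
}

const sample: HistoryRow[] = [
  { datasetVersion: 3, input: { text: 'Hello' }, groundTruth: { translation: '¡Hola!' } },
  { datasetVersion: 1, input: { text: 'Hello' }, groundTruth: { translation: 'Hola' } },
]

// One entry: at version 3, only groundTruth changed.
console.log(summarizeChanges(sample))
```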

### Pinning to a version

Fetch the exact items that existed at a past version:

```typescript
const items = await dataset.listItems({ version: 2 })
```
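With the items from two versions in hand, you can compute what was added, changed, or removed between them, similar in spirit to the **Compare Versions** view in Studio. This is a rough sketch, not Studio's implementation. It assumes each item carries a stable `id` field, which this page doesn't confirm, and the `VersionedItem` type and `diffVersions` helper are hypothetical:

```typescript
// Hypothetical item shape: `id` as the stable identifier is an assumption.
interface VersionedItem {
  id: string
  input: unknown
  groundTruth: unknown
}

interface VersionDiff {
  added: string[]
  removed: string[]
  changed: string[]
}

function diffVersions(before: VersionedItem[], after: VersionedItem[]): VersionDiff {
  const beforeById = new Map(before.map(item => [item.id, item] as const))
  const afterIds = new Set(after.map(item => item.id))
  const diff: VersionDiff = { added: [], removed: [], changed: [] }

  for (const item of after) {
    const old = beforeById.get(item.id)
    if (!old) diff.added.push(item.id)
    // JSON comparison is a simplification that treats key order as significant.
    else if (JSON.stringify(old) !== JSON.stringify(item)) diff.changed.push(item.id)
  }
  for (const item of before) {
    if (!afterIds.has(item.id)) diff.removed.push(item.id)
  }
  return diff
}

const v1: VersionedItem[] = [
  { id: 'item-1', input: { text: 'Hello' }, groundTruth: { translation: 'Hola' } },
  { id: 'item-2', input: { text: 'Goodbye' }, groundTruth: { translation: 'Adiós' } },
]
const v2: VersionedItem[] = [
  { id: 'item-1', input: { text: 'Hello' }, groundTruth: { translation: '¡Hola!' } },
  { id: 'item-3', input: { text: 'Thank you' }, groundTruth: { translation: 'Gracias' } },
]

console.log(diffVersions(v1, v2))
```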

You can also pin experiments to a version. See [running experiments](https://mastra.ai/docs/evals/datasets/running-experiments) for details.

> **Info:** Visit the [`Dataset` reference](https://mastra.ai/reference/datasets/dataset) for the full list of methods and parameters.

## Related

- [Running experiments](https://mastra.ai/docs/evals/datasets/running-experiments)
- [Scorers overview](https://mastra.ai/docs/evals/overview)
- [DatasetsManager reference](https://mastra.ai/reference/datasets/datasets-manager)
- [Dataset reference](https://mastra.ai/reference/datasets/dataset)