---
title: dedup
description: Content-address object bodies by their hash so identical content is stored only once. Re-uploading the same bytes skips the upload, and copies share a single stored blob. No native dependencies; works on any adapter that supports metadata.
---

The built-in `dedup()` plugin stores each distinct body **once**. On `upload` it hashes the bytes (SHA-256), writes them a single time to a content-addressed blob under a store prefix (`.dedup/` by default), and leaves a tiny pointer at your logical key. Upload the same content again — under any key — and the byte upload is **skipped**; only the pointer is written.

```ts lineNumbers
import { createFiles } from "files-sdk";
import { s3 } from "files-sdk/s3";
import { dedup } from "files-sdk/dedup";

const files = createFiles({
  adapter: s3({ bucket: "uploads" }),
  plugins: [dedup()],
});

await files.upload("a.png", bytes);
await files.upload("b.png", bytes); // same content — no second byte upload
await files.copy("a.png", "c.png"); // shares the one stored blob
```

## How it works

The logical key (`a.png`) becomes an empty object whose `metadata` records the content hash; the bytes live at `.dedup/<sha256>`. Two keys with identical content point at the same blob, so the content is stored once no matter how many keys reference it.

- **`upload`** hashes the body, writes the blob only if that hash isn't already stored (`exists()`), then writes the pointer.
- **`download`** follows the pointer to the blob and returns it under your key — **ranges included**, because blobs are stored verbatim (unlike [`compression()`](/plugins/compression)).
- **`head` / `list`** report the logical content size with the internal fields stripped, without fetching the blob. List hides the blob store from normal listings.
- **`copy` / `move`** relocate the small pointer, so duplicating a de-duplicated file is near-free and the copy shares the original blob.

Bulk `upload([...])` / `download([...])` are de-duplicated per item. Objects without this plugin's marker — pre-existing, or written by another tool — **pass straight through** on read, so it's safe to enable on a bucket that already has data.

## Options

| Option   | Default    | What it does                                                  |
| -------- | ---------- | ------------------------------------------------------------- |
| `prefix` | `".dedup"` | Where the content-addressed blobs live. Hidden from `list()`. |

Objects under `prefix` are never themselves de-duplicated and are hidden from `list()` (unless you list within the prefix). Don't store your own data there.

## Ordering

Put `dedup()` **first**, before any body-transforming plugin. Encrypted bytes don't de-duplicate — a random per-object key makes identical inputs encrypt to different bytes — so de-duplication has to see the original content:

```ts
plugins: [dedup(), compression(), encryption(key)];
```

With this order the one stored blob is itself compressed and encrypted, and reads unwind the onion automatically (decrypt → decompress → follow the pointer).

## Things to keep in mind

- **It buffers the whole body** to hash it, so — like [`compression()`](/plugins/compression) — it's unsuitable for unknown-length [streaming uploads](/api/upload) and [resumable uploads](/resumable).
- **Reads cost a second fetch.** A download reads the pointer, then the blob; a ranged download does a `head` first. `head` and `list` add nothing — they only read the pointer.
- **It needs adapter metadata support.** The hash round-trips through `metadata`, the same gate a direct `metadata` upload hits.
- **`url()` and `signedUploadUrl()` fail closed.** A presigned GET would hand out the empty pointer, not the content, and a presigned PUT would write directly and bypass content-addressing — so both throw. Download through the `Files` instance instead.
- **Blobs aren't garbage-collected.** `delete` (and an overwrite) drop the pointer but leave the content addressed — so it's reused if the same bytes reappear. Reclaim unreferenced blobs with a storage lifecycle rule or a periodic sweep.
