---
title: failover
description: Read from and write to the primary adapter, and fall back to one or more secondary adapters when a backend is down. A live, per-operation failover chain - body-transparent, no native dependencies.
---

The built-in `failover()` plugin keeps a `Files` instance serving while a backend is down. Every operation tries the **primary** first; if it throws because the backend is unreachable, the plugin retries against a **secondary** - and the next, and the next - until one succeeds. The primary is the instance's own adapter (reached through the rest of the pipeline); the secondaries are backup adapters you pass in.

It's **body-transparent**: it never buffers or transforms bytes, so streaming, [range downloads](/api/download), [`url()`](/api/url), and [`signedUploadUrl()`](/api/signed-upload-url) all keep working. It has **no native dependencies**, adds no methods (`wrap` only, so plain `new Files()` is enough), and works on any set of adapters.

```ts lineNumbers
import { Files } from "files-sdk";
import { s3 } from "files-sdk/s3";
import { failover } from "files-sdk/failover";

const files = new Files({
  adapter: s3({ bucket: "primary", region: "us-east-1" }), // primary
  plugins: [
    failover({
      secondaries: s3({ bucket: "backup", region: "us-west-2" }),
      onFailover: ({ operation, failed }) =>
        console.warn(`failover: ${operation} fell off backend ${failed}`),
    }),
  ],
});

const body = "invoice contents"; // any replayable body; a string keeps the example short

await files.download("report.pdf"); // primary, or the backup if it's down
await files.upload("invoice.pdf", body); // lands on the first reachable backend
```

## Failover vs replication vs tiering

These three Tier B plugins all take a second adapter, but they do different jobs:

| Plugin                          | What it does                                                               |
| ------------------------------- | -------------------------------------------------------------------------- |
| `failover()`                    | Try the primary; fall back to a secondary **only when a backend is down**. |
| `replication()`                 | Write **every** mutation to all backends (fan-out).                        |
| [`tiering()`](/plugins/tiering) | **Partition** objects across backends by key / size / age.                 |

`failover()` treats each secondary as a **full replica** of one namespace, so it never splits or merges data across backends. It's the availability lever: keep serving against whatever backend is up.

## The failover chain

Pass one secondary or several. They're tried in order after the primary, forming the chain `[primary, ...secondaries]`:

```ts
failover({
  secondaries: [
    s3({ bucket: "backup-eu" }), // tried after the primary
    s3({ bucket: "backup-us" }), // tried after that
  ],
});
```

Each operation walks the chain until one backend succeeds. If **every** backend is down, the last error is thrown.
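
The walk can be pictured as a small loop. This is a minimal sketch of the idea, not the plugin's source: `Backend` and `walkChain` are hypothetical names, and each backend is modelled as a single async call that either resolves or throws.

```ts
// Hypothetical model of a backend: one async operation that may throw.
type Backend<T> = () => Promise<T>;

// Walk `[primary, ...secondaries]` in order and return the first success.
// `shouldFailover` decides whether an error means "try the next backend".
async function walkChain<T>(
  chain: Backend<T>[],
  shouldFailover: (error: unknown) => boolean,
): Promise<T> {
  let lastError: unknown = new Error("empty failover chain");
  for (const backend of chain) {
    try {
      return await backend();
    } catch (error) {
      lastError = error;
      // A definitive answer (not an outage) is surfaced immediately.
      if (!shouldFailover(error)) throw error;
    }
  }
  // Every backend failed: surface the last error seen.
  throw lastError;
}

// Usage: the primary is unreachable, so the backup answers.
const down = async (): Promise<string> => {
  throw new Error("unreachable");
};
const up = async (): Promise<string> => "bytes from backup";

walkChain([down, up], () => true).then((body) => console.log(body));
// prints "bytes from backup"
```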

## When does it fail over?

By default, the plugin fails over **only** on a `Provider` [error](/errors) - a network failure, timeout, or 5xx, i.e. "the backend is down" - and **never** on an aborted request. A definitive answer from a _healthy_ backend is surfaced as-is, not masked by probing a replica:

- a `NotFound` stays a `NotFound` - a genuine 404 isn't turned into a slow scan of every replica;
- an `Unauthorized` / `Conflict` / `ReadOnly` is likewise a real answer, not a reason to try elsewhere.

This keeps reads honest: the primary is the source of truth, and the secondary only answers when the primary can't.
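
Read as code, the default rule looks roughly like the predicate below. It is a sketch of the behaviour described above, not the plugin's implementation: `ErrorLike` is a hypothetical stand-in for `FilesError` that carries only the `code` and `aborted` fields this page mentions, and it assumes an aborted request is reported through `aborted: true`.

```ts
// Hypothetical error shape: just the two fields this page relies on.
interface ErrorLike {
  code: string;
  aborted: boolean;
}

// Fail over only when the backend looks down, never on an abort.
function defaultShouldFailover(error: ErrorLike): boolean {
  return error.code === "Provider" && !error.aborted;
}

defaultShouldFailover({ code: "Provider", aborted: false }); // true: outage
defaultShouldFailover({ code: "Provider", aborted: true }); // false: caller aborted
defaultShouldFailover({ code: "NotFound", aborted: false }); // false: a real answer
```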

### Customising the predicate

Pass `shouldFailover` to change the rule. It receives the error normalized to a [`FilesError`](/errors) (so `code` and `aborted` are always set). For example, to **read through** to a replica on a miss - useful when the secondary is a live mirror that may be ahead of the primary:

```ts
failover({
  secondaries: replica,
  shouldFailover: (error) =>
    error.code === "NotFound" || error.code === "Provider",
});
```

<Callout type="warn">
  Failing over on `NotFound` means a `delete` that only reached one backend can
  be "resurrected" on the next read from another. Reach for it only when your
  secondaries are genuine replicas kept in sync.
</Callout>

## What each verb does

- **`download` / `head` / `url` / `exists`** read from the first reachable backend.
- **`upload` / `delete` / `copy` / `move`** run against the first reachable backend. They are **not** fanned out to every backend - that's [`replication()`](#failover-vs-replication-vs-tiering).
- **`list`** returns the first reachable backend's page. It is **not** merged across backends (each secondary is a full replica, so there's nothing to interleave).
- **`signedUploadUrl`** signs against the first reachable backend.

[Bulk](/plugins#bulk-operations-too) calls fan out to one operation per item, so each element fails over independently.

### Streaming uploads

A `ReadableStream` body is **read-once** - once the primary has consumed it, there's nothing left to replay against a backup. So a streaming `upload` runs against the **primary alone** and isn't failed over; if the primary is down, the upload fails. Every other body (a string, `Blob`, `File`, `ArrayBuffer`, or typed array) can be re-read, so it fails over normally. Buffer a stream up front if you need a streaming upload to survive a primary outage.
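
One way to buffer a stream is to drain it through the standard `Response` API, which is available in browsers and in Node 18+. `bufferStream` is a hypothetical helper, not part of the SDK, and it holds the whole object in memory, so it only suits bodies that fit comfortably in RAM.

```ts
// Drain a read-once stream into a typed array that can be re-read.
async function bufferStream(
  stream: ReadableStream<Uint8Array>,
): Promise<Uint8Array> {
  return new Uint8Array(await new Response(stream).arrayBuffer());
}

// Usage, assuming `files` is the instance from the first example and
// `incoming` is a stream you received (for instance, a request body):
//
//   const body = await bufferStream(incoming);
//   await files.upload("invoice.pdf", body); // a typed array can fail over
```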

## Observing failovers

`onFailover` fires (fire-and-forget) each time an operation moves to the next backend - wire it to your metrics or alerting to learn a backend is degraded:

```ts lineNumbers
failover({
  secondaries: [backupA, backupB],
  onFailover: ({ operation, failed, next, error }) => {
    // `metrics` and `log` stand in for your own telemetry clients.
    metrics.increment("storage.failover", { operation, from: failed, to: next });
    log.warn(`backend ${failed} failed ${operation}, trying ${next}: ${error.message}`);
  },
});
```

`failed` and `next` are indices into `[primary, ...secondaries]` - `0` is the primary, `1` the first secondary, and so on. A throw from the handler is swallowed, so it can never break the operation.
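
Raw indices are hard to read on a dashboard, so you may want to map them to names before emitting a metric. `backendLabel` and `secondaryNames` below are hypothetical: the plugin only hands you the numbers, and the name list is one you maintain yourself, in the same order as `secondaries`.

```ts
// Turn a chain index into a human-readable label for metrics and logs.
function backendLabel(index: number, secondaryNames: string[]): string {
  if (index === 0) return "primary";
  // Index 1 is the first secondary, so shift by one into the name list.
  return secondaryNames[index - 1] ?? `secondary-${index}`;
}

const names = ["backup-eu", "backup-us"];
backendLabel(0, names); // "primary"
backendLabel(1, names); // "backup-eu"
backendLabel(3, names); // "secondary-3" (no name registered for that slot)
```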

## Consistency: availability, not convergence

Failover buys **availability**, not consistency. An object written to a secondary while the primary was down lives **only** on that secondary; once the primary recovers, reads go to the primary first, and with the default predicate its `NotFound` is returned as-is. Failover doesn't reconcile that gap for you. To converge:

- keep the secondary current with `replication()` (write-through to both), or
- reconcile after an outage with [`sync`](/sync) / [`transfer`](/transfer), or
- pass a `shouldFailover` that also fails over on `NotFound` so reads fall through to the replica.
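
The gap is easy to reproduce with a toy in-memory model. Nothing below is the SDK's API: `MemoryBackend`, `StoreError`, and `withFailover` are hypothetical and synchronous, and exist only to show why a write made during an outage disappears from view once the primary is back.

```ts
// Toy error carrying a code, loosely mirroring the `code` field on this page.
class StoreError extends Error {
  constructor(public code: "Provider" | "NotFound") {
    super(code);
  }
}

// Toy backend: a Map plus an up/down switch.
class MemoryBackend {
  up = true;
  private objects = new Map<string, string>();

  put(key: string, value: string): void {
    if (!this.up) throw new StoreError("Provider");
    this.objects.set(key, value);
  }

  get(key: string): string {
    if (!this.up) throw new StoreError("Provider");
    const value = this.objects.get(key);
    if (value === undefined) throw new StoreError("NotFound");
    return value;
  }
}

// Default-style rule: only an outage moves on to the next backend.
function withFailover<T>(
  chain: MemoryBackend[],
  op: (backend: MemoryBackend) => T,
): T {
  let last: unknown = new Error("empty chain");
  for (const backend of chain) {
    try {
      return op(backend);
    } catch (error) {
      last = error;
      if ((error as StoreError).code !== "Provider") throw error;
    }
  }
  throw last;
}

const primary = new MemoryBackend();
const backup = new MemoryBackend();
const chain = [primary, backup];

primary.up = false; // outage begins
withFailover(chain, (b) => b.put("invoice.pdf", "v1")); // lands on the backup only
primary.up = true; // primary recovers

let outcome = "unset";
try {
  outcome = withFailover(chain, (b) => b.get("invoice.pdf"));
} catch (error) {
  outcome = (error as StoreError).code;
}
// outcome is "NotFound": the healthy primary answers, the backup is never asked
```

Each option in the list closes the gap from a different side: write-through means the primary would have received the object too, reconciliation copies it over after the outage, and a `NotFound`-aware predicate lets the read continue to the backup.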

## Ordering and prefixes

- **Place it last (innermost).** Body-transforming plugins like [`encryption()`](/plugins/encryption) and [`compression()`](/plugins/compression) wrap `failover()` and transform the op on the way in, so the same bytes reach **every** backend:

  ```ts
  plugins: [encryption(key), failover({ secondaries: backup })];
  ```

- **Address objects by caller-facing keys.** Each secondary does **not** receive the instance `prefix`, so give it its own bucket / container and avoid a client `prefix` on a failover instance.

## Things to keep in mind

- **Secondaries are real stores.** A failed-over read pays the secondary's latency, and it must actually hold the object (keep it in sync with `replication()` / `sync`).
- **The primary is the source of truth.** With the default predicate, a healthy primary's `NotFound` is returned without consulting any replica.
- **Streaming uploads don't fail over.** A `ReadableStream` can't be replayed; buffer it first if it must survive a primary outage.
