
# Recall Memories

Retrieve memories from a bank using multi-strategy recall.

When you **recall**, Hindsight runs four retrieval strategies in parallel — semantic similarity, keyword (BM25), graph traversal, and temporal — then fuses and reranks the results into a single ranked list. The response contains structured facts, not raw documents.

{/* Import raw source files */}

:::info How Recall Works
Learn about the four retrieval strategies (semantic, keyword, graph, temporal) and RRF fusion in the [Recall Architecture](../retrieval.md) guide.
> **💡 Prerequisites**
> 
Make sure you've completed the [Quick Start](./quickstart) to install the client and start the server.
## Basic Recall

### Python

```python
response = client.recall(bank_id="my-bank", query="What does Alice do?")

# response.results is a list of RecallResult objects, each with:
# - id:             fact ID
# - text:           the extracted fact
# - type:           "world", "experience", or "observation"
# - context:        context label set during retain
# - metadata:       dict[str, str] set during retain
# - tags:           list of tags
# - entities:       list of entity name strings linked to this fact
# - occurred_start: ISO datetime of when the event started
# - occurred_end:   ISO datetime of when the event ended
# - mentioned_at:   ISO datetime of when the fact was retained
# - document_id:    document this fact belongs to
# - chunk_id:       chunk this fact was extracted from

# Example response.results:
# [
#   RecallResult(id="a1b2...", text="Alice works at Google as a software engineer", type="world", context="career", ...),
#   RecallResult(id="c3d4...", text="Alice got promoted to senior engineer", type="experience", occurred_start="2024-03-15T00:00:00Z", ...),
# ]
```

### Node.js

```javascript
const response = await client.recall('my-bank', 'What does Alice do?');

// response.results is an array of result objects, each with:
// - id:            fact ID
// - text:          the extracted fact
// - type:          "world", "experience", or "observation"
// - context:       context label set during retain
// - metadata:      Record<string, string> set during retain
// - tags:          string[] of tags
// - entities:      string[] of entity names linked to this fact
// - occurredStart: ISO datetime of when the event started
// - occurredEnd:   ISO datetime of when the event ended
// - mentionedAt:   ISO datetime of when the fact was retained
// - documentId:    document this fact belongs to
// - chunkId:       chunk this fact was extracted from

// Example response.results:
// [
//   { id: "a1b2...", text: "Alice works at Google as a software engineer", type: "world", context: "career", ... },
//   { id: "c3d4...", text: "Alice got promoted to senior engineer", type: "experience", occurredStart: "2024-03-15T00:00:00Z", ... },
// ]
```

### CLI

```bash
hindsight memory recall my-bank "What does Alice do?"
```

### Go

```go
# Section 'recall-basic' not found in api/recall.go
```

---

## Parameters

### query

The natural language question or statement to search for. This is the only required field. The query drives all four retrieval strategies simultaneously: it is embedded for semantic search, tokenized for BM25 keyword search, used to seed graph traversal, and parsed for temporal expressions. After retrieval, the raw query text is also passed to the cross-encoder reranker to re-score every candidate. Queries exceeding 500 tokens are rejected.

### types

Controls which categories of memory facts are searched. Accepted values are `world` (objective facts), `experience` (events and conversations), and `observation` (deduplicated, evidence-grounded beliefs consolidated from multiple memories). When omitted, all three types are searched.

Each type runs the full four-strategy retrieval pipeline independently, so narrowing `types` reduces both the result set and query cost.

### Python

```python
# Only world facts (objective information)
world_facts = client.recall(
    bank_id="my-bank",
    query="Where does Alice work?",
    types=["world"]
)
```
```python
# Only experience (conversations and events)
experience = client.recall(
    bank_id="my-bank",
    query="What have I recommended?",
    types=["experience"]
)
```
```python
# Only observations (consolidated knowledge)
observations = client.recall(
    bank_id="my-bank",
    query="What patterns have I learned?",
    types=["observation"]
)
```

### Node.js

```javascript
await client.recall('my-bank', 'query', { types: ['world'] });
```
```javascript
await client.recall('my-bank', 'query', { types: ['experience'] });
```
```javascript
await client.recall('my-bank', 'query', { types: ['observation'] });
```

### CLI

```bash
hindsight memory recall my-bank "query" --fact-type world,observation
```

### Go

```go
# Section 'recall-world-only' not found in api/recall.go
```
```go
# Section 'recall-experience-only' not found in api/recall.go
```
```go
# Section 'recall-observations-only' not found in api/recall.go
```

> **💡 About Observations**
> 
Observations are deduplicated, evidence-grounded beliefs consolidated from multiple facts — preferences, recurring patterns, and durable learnings the memory bank has built up. Each observation references its supporting memories (with exact quotes) and carries a computed freshness trend, and is refined rather than overwritten when new evidence arrives. They are created and maintained automatically in the background after retain operations.
### budget

Controls retrieval depth and breadth. Accepted values are `low`, `mid` (default), and `high`. Use `low` for fast simple lookups, `mid` for balanced everyday queries, and `high` when you need to find indirect connections or exhaustive coverage.

### Python

```python
# Quick lookup
results = client.recall(bank_id="my-bank", query="Alice's email", budget="low")

# Deep exploration
results = client.recall(bank_id="my-bank", query="How are Alice and Bob connected?", budget="high")
```

### Node.js

```javascript
// Quick lookup
const quickResults = await client.recall('my-bank', "Alice's email", { budget: 'low' });

// Deep exploration
const deepResults = await client.recall('my-bank', 'How are Alice and Bob connected?', { budget: 'high' });
```

### CLI

```bash
# Quick lookup
hindsight memory recall my-bank "Alice's email" --budget low

# Deep exploration
hindsight memory recall my-bank "How are Alice and Bob connected?" --budget high
```

### Go

```go
# Section 'recall-budget-levels' not found in api/recall.go
```

### max_tokens

The maximum number of tokens the returned facts can collectively occupy. Defaults to `4096`. Only the `text` field of each fact is counted toward this budget — metadata, tags, entities, and other fields are not included. After reranking, facts are included in relevance order until this budget is exhausted — so you always get the most relevant memories that fit. Hindsight is designed for agents, which think in tokens rather than result counts: set `max_tokens` to however much of your context window you want to allocate to memories.

### Python

```python
# Fill up to 4K tokens of context with relevant memories
results = client.recall(bank_id="my-bank", query="What do I know about Alice?", max_tokens=4096)

# Smaller budget for quick lookups
results = client.recall(bank_id="my-bank", query="Alice's email", max_tokens=500)
```

### Node.js

```javascript
// Fill up to 4K tokens of context with relevant memories
await client.recall('my-bank', 'What do I know about Alice?', { maxTokens: 4096 });

// Smaller budget for quick lookups
await client.recall('my-bank', "Alice's email", { maxTokens: 500 });
```

### CLI

```bash
# Fill up to 4K tokens of context with relevant memories
hindsight memory recall my-bank "What do I know about Alice?" --max-tokens 4096

# Smaller budget for quick lookups
hindsight memory recall my-bank "Alice's email" --max-tokens 500
```

### Go

```go
# Section 'recall-token-budget' not found in api/recall.go
```

### query_timestamp

An ISO 8601 datetime representing when the query is being asked, from the user's perspective. When provided, it is used as the anchor for resolving relative temporal expressions in the query — for example, if the query says "last month" and `query_timestamp` is `2023-05-30`, the temporal search window becomes approximately April 2023. Without it, the server's current time is used as the anchor. This field matters most for replaying historical conversations or building agents that need time-anchored recall.

### include

An optional object controlling supplementary data returned alongside the main facts.

#### chunks

When enabled, the response includes the raw source text chunks from which each fact was extracted. Chunks are fetched before the `max_tokens` filter, so setting `max_tokens=0` returns no facts but can still return chunks. The `max_tokens` sub-option (default `8192`) controls the total chunk token budget independently of the main fact budget. This is useful when agents need surrounding context beyond the extracted fact text.

:::note
When `include_chunks` is enabled, chunks are fetched based on the top-scored reranked results before token filtering. The last chunk is truncated (not dropped) to fit exactly within the budget, and each chunk carries a `truncated` flag indicating whether it was cut.
#### source_facts

When enabled and `types` includes `observation`, each observation result is accompanied by the original contributing facts it was synthesized from. Source facts are returned in a top-level `source_facts` dict keyed by fact ID, and each observation result carries a `source_fact_ids` list for cross-referencing. Facts are deduplicated across observations. The `max_tokens` sub-option (default `4096`) limits the total token budget for source facts.

### Python

```python
# Recall observations and include their source facts
response = client.recall(
    bank_id="my-bank",
    query="What patterns have I learned about Alice?",
    types=["observation"],
    include_source_facts=True,
    max_source_facts_tokens=4096,
)

for obs in response.results:
    print(f"Observation: {obs.text}")
    if obs.source_fact_ids and response.source_facts:
        print("  Derived from:")
        for fact_id in obs.source_fact_ids:
            fact = response.source_facts.get(fact_id)
            if fact:
                print(f"    - [{fact.type}] {fact.text}")
```

### Node.js

```javascript
// Recall observations and include their source facts
const obsResponse = await client.recall('my-bank', 'What patterns have I learned about Alice?', {
    types: ['observation'],
    includeSourceFacts: true,
    maxSourceFactsTokens: 4096,
});

for (const obs of obsResponse.results) {
    console.log(`Observation: ${obs.text}`);
    if (obs.source_fact_ids && obsResponse.source_facts) {
        console.log('  Derived from:');
        for (const factId of obs.source_fact_ids) {
            const fact = obsResponse.source_facts[factId];
            if (fact) console.log(`    - [${fact.type}] ${fact.text}`);
        }
    }
}
```

### CLI

```bash
# Recall observations with source facts
hindsight memory recall my-bank "What patterns have I learned about Alice?" \
  --fact-type observation
```

### Go

```go
# Section 'recall-source-facts' not found in api/recall.go
```

#### entities

Enabled by default. When active, each returned fact includes the canonical names of entities associated with it. Set to `null` to skip the entity JOIN query and reduce response size. The `max_tokens` sub-option (default `500`) is a future-facing guard for entity data.

### tags

Filters recall to only memories that match the specified tags. When omitted, all memories regardless of tags are eligible. Tag filtering is applied at the database level across all four retrieval strategies, not as a post-processing step.

The `tags_match` parameter controls the filtering logic:

| Mode | Untagged memories | Match condition |
|------|-------------------|-----------------|
| `any` (default) | Included | Memory has **at least one** of the specified tags |
| `any_strict` | Excluded | Memory has **at least one** of the specified tags |
| `all` | Included | Memory has **all** of the specified tags |
| `all_strict` | Excluded | Memory has **all** of the specified tags |

#### Scenario setup

Consider a bank with these four memories:

| Memory | Tags |
|--------|------|
| "Alice prefers async communication" | `["user:alice"]` |
| "Bob dislikes long meetings" | `["user:bob"]` |
| "Team uses Slack for announcements" | `["user:alice", "team"]` |
| "Company policy: no meetings on Fridays" | *(untagged)* |

#### `any` — OR matching, includes untagged (default)

Returns memories that have **at least one** matching tag, plus untagged memories.

### Python

```python
response = client.recall(
    bank_id="my-bank",
    query="communication preferences",
    tags=["user:alice"],
    tags_match="any",  # default
)
# Returns:
#   [match]    "Alice prefers async communication"     — has "user:alice"
#   [no match] "Bob dislikes long meetings"             — no overlap with ["user:alice"]
#   [match]    "Team uses Slack for announcements"      — has "user:alice"
#   [match]    "Company policy: no meetings on Fridays" — untagged, included by default
```

### Node.js

```javascript
await client.recall('my-bank', 'communication preferences', {
    tags: ['user:alice'],
    tagsMatch: 'any'
});
```

### CLI

```bash
hindsight memory recall my-bank "communication preferences" \
  --tags "user:alice" --tags-match any
```

### Go

```go
# Section 'recall-with-tags' not found in api/recall.go
```

Use this for **shared global knowledge + user-specific** patterns, where untagged memories represent information everyone should see.

#### `any_strict` — OR matching, excludes untagged

Same as `any` but untagged memories are excluded.

### Python

```python
response = client.recall(
    bank_id="my-bank",
    query="communication preferences",
    tags=["user:alice"],
    tags_match="any_strict",
)
# Returns:
#   [match]    "Alice prefers async communication"     — has "user:alice"
#   [no match] "Bob dislikes long meetings"             — no overlap with ["user:alice"]
#   [match]    "Team uses Slack for announcements"      — has "user:alice"
#   [no match] "Company policy: no meetings on Fridays" — untagged, excluded
```

### Node.js

```javascript
await client.recall('my-bank', 'communication preferences', {
    tags: ['user:alice'],
    tagsMatch: 'any_strict'
});
```

### CLI

```bash
hindsight memory recall my-bank "communication preferences" \
  --tags "user:alice" --tags-match any_strict
```

### Go

```go
# Section 'recall-tags-strict' not found in api/recall.go
```

Use this when memories are **fully partitioned by tags** and untagged memories should never be visible.

#### `all` — AND matching, includes untagged

Returns memories that have **every** specified tag, plus untagged memories.

### Python

```python
response = client.recall(
    bank_id="my-bank",
    query="communication tools",
    tags=["user:alice", "team"],
    tags_match="all",
)
# Returns:
#   [no match] "Alice prefers async communication"     — missing "team"
#   [no match] "Bob dislikes long meetings"             — missing both tags
#   [match]    "Team uses Slack for announcements"      — has both "user:alice" and "team"
#   [match]    "Company policy: no meetings on Fridays" — untagged, included by default
```

### Node.js

```javascript
await client.recall('my-bank', 'communication tools', {
    tags: ['user:alice', 'team'],
    tagsMatch: 'all'
});
```

### CLI

```bash
hindsight memory recall my-bank "communication tools" \
  --tags "user:alice,team" --tags-match all
```

### Go

```go
# Section 'recall-tags-all-mode' not found in api/recall.go
```

Use this when memories must belong to a **specific intersection** of scopes (e.g., only memories relevant to both a user and a project), while still surfacing shared global knowledge.

#### `all_strict` — AND matching, excludes untagged

Returns memories that have **every** specified tag, and excludes untagged memories.

### Python

```python
response = client.recall(
    bank_id="my-bank",
    query="communication tools",
    tags=["user:alice", "team"],
    tags_match="all_strict",
)
# Returns:
#   [no match] "Alice prefers async communication"     — missing "team"
#   [no match] "Bob dislikes long meetings"             — missing both tags
#   [match]    "Team uses Slack for announcements"      — has both "user:alice" and "team"
#   [no match] "Company policy: no meetings on Fridays" — untagged, excluded
```

### Node.js

```javascript
await client.recall('my-bank', 'communication tools', {
    tags: ['user:alice', 'team'],
    tagsMatch: 'all_strict'
});
```

### CLI

```bash
hindsight memory recall my-bank "communication tools" \
  --tags "user:alice,team" --tags-match all_strict
```

### Go

```go
# Section 'recall-tags-all' not found in api/recall.go
```

Use this for strict scope enforcement where a memory must explicitly belong to **all** specified contexts.

> **💡 Extra tags are fine**
> 
A memory with tags `["user:alice", "team", "project:x"]` will still match a filter of `["user:alice", "team"]` under `all_strict` — extra tags on the memory are not a problem. The filter only requires the memory to contain **at least** the specified tags.
### tag_groups

`tag_groups` is a list of compound boolean tag filters. The groups in the list are AND-ed together at the top level. Each group is a recursive boolean expression: a **leaf** node `{tags, match}`, or a **compound** node `{and: [...]}`, `{or: [...]}`, or `{not: ...}`.

`tag_groups` and `tags` / `tags_match` can be used simultaneously — they are AND-ed together.

#### Leaf node

```json
{ "tags": ["step:5", "step:8"], "match": "any_strict" }
```

`match` accepts the same values as `tags_match`: `any`, `all`, `any_strict`, `all_strict`. Defaults to `any_strict`.

#### Compound nodes

```json
{ "and": [ <TagGroup>, <TagGroup>, ... ] }
{ "or":  [ <TagGroup>, <TagGroup>, ... ] }
{ "not": <TagGroup> }
```

#### Examples

**Step filter AND user scope** — two top-level groups AND-ed:

```json
{
  "tag_groups": [
    { "tags": ["step:5", "step:8", "step:12"], "match": "any_strict" },
    { "tags": ["user:ep_42"], "match": "all_strict" }
  ]
}
```

**Nested OR inside AND** — user must match, plus either step OR priority:

```json
{
  "tag_groups": [
    { "tags": ["user:alice"], "match": "all_strict" },
    { "or": [
        { "tags": ["step:5"], "match": "any_strict" },
        { "tags": ["priority:high"], "match": "all_strict" }
    ]}
  ]
}
```

**Exclusion** — user must match, but archived memories are excluded:

```json
{
  "tag_groups": [
    { "tags": ["user:alice"], "match": "all_strict" },
    { "not": { "tags": ["archived"], "match": "any_strict" } }
  ]
}
```

### trace

When set to `true`, the response includes a detailed debug trace covering the query embedding, entry points, per-strategy retrieval results, RRF fusion candidates, reranked results, temporal constraints detected, and per-phase timings. Has no effect on the retrieval logic itself. Useful for understanding why specific memories were or were not returned.

---

## Response

### results

The main list of recalled facts, ordered by relevance. Relevance is computed by running four retrieval strategies in parallel — semantic similarity, BM25 keyword, graph traversal, and temporal — fusing their rankings with Reciprocal Rank Fusion (RRF), then re-scoring the merged candidates with a cross-encoder reranker against the original query.

Results do not include a numeric score. Raw retrieval scores are not meaningful on an absolute scale — a score of 0.8 from one query tells you nothing useful compared to a score of 0.8 from another. What matters is the relative ordering, which is already reflected in the list order. Agents should consume memories in order and let `max_tokens` determine how many fit, rather than filtering by score.

Each item in `results` has the following fields:

#### id

The unique identifier of this fact. Use it to cross-reference with `source_facts` or for application-level deduplication.

#### text

The extracted fact text as stored in the memory bank.

#### type

The fact category: `world` for objective information, `experience` for events and conversations, or `observation` for consolidated knowledge synthesized over time.

#### context

The context label provided when the fact was retained (e.g., `"team meeting"`, `"slack"`). `null` if none was set.

#### metadata

The key-value string pairs attached when the fact was retained. `null` if none were set.

#### tags

The visibility-scoping tags attached to this fact.

#### entities

A list of canonical entity name strings linked to this fact. Only populated when `include.entities` is enabled (the default). `null` otherwise.

#### occurred_start / occurred_end

ISO 8601 datetimes representing when the described event started and ended. Extracted by the LLM from the content during retain. `null` if the content had no temporal information.

#### mentioned_at

ISO 8601 datetime of when this fact was retained into the bank.

#### document_id

The document ID this fact belongs to, as set during retain.

#### chunk_id

The ID of the source text chunk this fact was extracted from. Used to cross-reference with `chunks` in the response when `include.chunks` is enabled.

#### source_fact_ids

For `observation`-type results only: the IDs of the original facts this observation was synthesized from. Cross-references with `source_facts` in the response. `null` for other types or when `include.source_facts` is not enabled.

---

### source_facts

A dict keyed by fact ID containing full `RecallResult` objects for the source facts that contributed to observation results. Only present when `include.source_facts` is enabled. Facts are deduplicated — if two observations share a source fact, it appears once.

### chunks

A dict keyed by chunk ID containing the raw source text chunks associated with the returned facts. Only present when `include.chunks` is enabled. Each chunk has `id`, `text`, `chunk_index`, and `truncated` (whether the text was cut to fit the token budget).

### entities

A dict keyed by canonical entity name containing entity state objects. Only present when `include.entities` is enabled. Each entry has `entity_id`, `canonical_name`, and `observations`.

### trace

A debug object present only when `trace: true` was set in the request. Contains per-phase timings, retrieval breakdowns, and RRF fusion details.
