# RAG Fundamentals Quick Reference

## RAG Pipeline

**Indexing:** Docs → Chunk → Embed → Store (vector DB)

**Query:** Query → Embed → Retrieve → Augment prompt → Generate
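
The two flows above can be sketched end to end in a few lines. This is a toy illustration, not a production setup: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, a plain list stands in for the vector DB, and the final generate step (sending the prompt to an LLM) is left out.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Indexing: embed each chunk and store it (a list stands in for the vector DB)."""
    return [(chunk, toy_embed(chunk)) for chunk in chunks]


def retrieve(query: str, index: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Query: embed the query, rank the stored chunks, keep the top k."""
    q = toy_embed(query)
    ranked = sorted(index, key=lambda item: similarity(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def augment(query: str, context: list[str]) -> str:
    """Augment: place the retrieved chunks ahead of the question in the prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


chunks = [
    "pgvector adds vector search to PostgreSQL",
    "Chunk overlap keeps context at boundaries",
    "Re-ranking is a second pass for precision",
]
index = build_index(chunks)
question = "what does chunk overlap do"
prompt = augment(question, retrieve(question, index, k=1))
print(prompt)
```

Swapping `toy_embed` for a real model and the list for a vector store changes the components but not the shape of the flow.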

## Embeddings

- Text → vectors (numbers)
- Similar meaning → similar vectors
- Rank by cosine similarity or dot product; both give the same order when vectors are unit-normalized
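
A small sketch of how the two similarity measures differ, using made-up 2-D vectors (real embeddings have hundreds or thousands of dimensions). Dot product rewards magnitude, cosine looks only at direction, and the two agree once vectors are scaled to unit length.

```python
import math


def dot(a: list[float], b: list[float]) -> float:
    """Dot product: sensitive to both direction and magnitude."""
    return sum(x * y for x, y in zip(a, b))


def norm(a: list[float]) -> float:
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product of the two directions only."""
    return dot(a, b) / (norm(a) * norm(b))


def normalize(a: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]


query = [1.0, 0.0]
same_direction = [2.0, 0.0]   # points the same way as the query
longer_vector = [3.0, 4.0]    # points elsewhere but has a larger magnitude

# Dot product ranks the longer vector first; cosine ranks the aligned one first.
print(dot(query, same_direction), dot(query, longer_vector))
print(cosine(query, same_direction), cosine(query, longer_vector))

# After normalization, dot product and cosine give the same score.
print(dot(normalize(query), normalize(longer_vector)))
```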

## Vector Databases

| Tool     | Use Case                        |
|----------|---------------------------------|
| Pinecone | Managed service, scalable       |
| Weaviate | Open-source, hybrid search      |
| Chroma   | Lightweight, open-source        |
| pgvector | Vector search inside PostgreSQL |

## Chunking

| Strategy   | Pros / Cons                                        |
|------------|----------------------------------------------------|
| Fixed-size | Simple; can split mid-sentence                     |
| Semantic   | Preserves meaning; uneven chunk sizes              |
| Recursive  | Flexible; more configuration                       |
| Overlap    | Keeps context at boundaries; stores duplicate text |
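
Fixed-size chunking with overlap is the easiest of these to show in code. The sketch below is a minimal version that counts characters; `chunk_fixed` and its parameters are illustrative names, and real pipelines often count tokens instead.

```python
def chunk_fixed(text: str, size: int, overlap: int = 0) -> list[str]:
    """Split text into windows of `size` characters, each sharing `overlap`
    characters with the previous window."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    # Stop once a window has reached the end of the text, so the last chunk
    # is never just a repeat of the previous chunk's tail.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, last_start, step)]


print(chunk_fixed("abcdefghij", size=4))             # no overlap
print(chunk_fixed("abcdefghij", size=4, overlap=2))  # neighbours share 2 chars
```

The overlap run produces more chunks for the same text, which is the storage cost of keeping boundary context.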

## Retrieval Tuning

- **Top-k** — How many chunks (often 3–10)
- **Threshold** — Minimum similarity score
- **Re-ranking** — Second pass for precision
- **Hybrid** — Keyword + semantic
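
Two of these knobs are easy to sketch without any library. `top_k` applies a score threshold and then a cut-off. `rrf` merges a keyword ranking and a semantic ranking with reciprocal rank fusion, which is one common way to build hybrid retrieval and is used here as an assumption, not as the only option. The chunk ids and scores are made up, and `c=60` is a conventional default for the fusion constant.

```python
def top_k(scores: dict[str, float], k: int, threshold: float = 0.0) -> list[str]:
    """Keep chunks scoring at least `threshold`, then return the best `k` ids."""
    kept = [(chunk_id, s) for chunk_id, s in scores.items() if s >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [chunk_id for chunk_id, _ in kept[:k]]


def rrf(rankings: list[list[str]], c: int = 60) -> list[str]:
    """Reciprocal rank fusion: each ranking adds 1 / (c + rank) to a chunk's score."""
    fused: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            fused[chunk_id] = fused.get(chunk_id, 0.0) + 1.0 / (c + rank)
    return sorted(fused, key=fused.get, reverse=True)


semantic_scores = {"a": 0.91, "b": 0.75, "c": 0.40, "d": 0.10}
print(top_k(semantic_scores, k=3, threshold=0.5))  # the threshold drops c and d

semantic_ranking = ["a", "b", "c"]
keyword_ranking = ["c", "a", "b"]
print(rrf([semantic_ranking, keyword_ranking]))
```

Fusing by rank instead of raw score sidesteps the problem that keyword scores and cosine scores live on different scales.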

## Evaluation

- **Faithfulness** — Grounded in context?
- **Relevance** — Right chunks retrieved?
- **Correctness** — Factually right?
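
Retrieval relevance is the one metric here that can be computed directly, given a hand-labelled set of relevant chunk ids per query. A minimal sketch, with made-up ids, of precision@k and recall@k follows; faithfulness and correctness usually need a human or model judge and are not covered by it.

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Share of the top-k retrieved chunks that are labelled relevant."""
    top = retrieved[:k]
    if not top:
        return 0.0
    return sum(1 for chunk_id in top if chunk_id in relevant) / len(top)


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Share of all relevant chunks that appear in the top k."""
    if not relevant:
        return 0.0
    found = set(retrieved[:k]) & relevant
    return len(found) / len(relevant)


retrieved = ["c1", "c7", "c3", "c9"]   # ids in ranked order (illustrative)
relevant = {"c1", "c3", "c5"}          # hand-labelled ground truth (illustrative)

print(precision_at_k(retrieved, relevant, k=3))
print(recall_at_k(retrieved, relevant, k=3))
```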

## Common Pitfalls

- Bad chunking → tune size, overlap, strategy
- No metadata → store source and date with each chunk, then filter on them
- Too much context → limit k, re-rank
- Wrong embeddings → match model to domain
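
The metadata and context-size fixes combine naturally: filter on stored metadata first, then keep only the best k. The sketch below assumes each chunk already carries a similarity score; the `Chunk` fields and `filter_then_rank` are hypothetical names for illustration, not any vector DB's API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Chunk:
    text: str
    source: str       # hypothetical metadata field
    published: date   # hypothetical metadata field
    score: float      # similarity to the query, assumed precomputed


def filter_then_rank(
    chunks: list[Chunk],
    source: Optional[str] = None,
    since: Optional[date] = None,
    k: int = 3,
) -> list[Chunk]:
    """Drop chunks that fail the metadata filters, then keep the k best by score."""
    kept = [
        c for c in chunks
        if (source is None or c.source == source)
        and (since is None or c.published >= since)
    ]
    return sorted(kept, key=lambda c: c.score, reverse=True)[:k]


chunks = [
    Chunk("2021 pricing table", "wiki", date(2021, 3, 1), 0.92),
    Chunk("2024 pricing table", "wiki", date(2024, 6, 1), 0.88),
    Chunk("Forum guess at pricing", "forum", date(2024, 7, 1), 0.90),
]

# Without filters the stale 2021 chunk wins on score alone.
print(filter_then_rank(chunks, k=1)[0].text)

# With filters only the recent wiki chunk survives.
fresh = filter_then_rank(chunks, source="wiki", since=date(2023, 1, 1))
print([c.text for c in fresh])
```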

## One-Liners

- **RAG** — Retrieve → Augment → Generate.
- **Embeddings** — Semantic similarity, not just keywords.
- **Chunking** — Semantic beats fixed for coherence.
- **Evaluate** — Faithfulness, relevance, correctness.
