# AI Agents — Tool Use, Planning, and Autonomous Workflows

<!-- hint:slides topic="AI agents: observe-think-act loop, tool use and MCP, planning strategies, memory, and multi-agent systems" slides="6" -->

## What Makes an Agent?

An **AI agent** combines an LLM with **tools** and a **control loop**. Instead of just generating text, the agent can take actions: search the web, run code, query a database, call APIs. The loop: observe → think → act → observe.

## The Agent Loop

```mermaid
flowchart TB
    subgraph loop["Agent Loop"]
        O[Observe: state, tools, feedback]
        T[Think: plan, decide]
        A[Act: call tool or respond]
    end
    O --> T --> T --> A
    A -->|Tool result| O
    A -->|Final answer| End[End]
```

1. **Observe** — Current state, available tools, user input, tool outputs
2. **Think** — Plan next step, choose tool or respond
3. **Act** — Execute tool call or produce final answer
4. **Observe** — Incorporate tool result; repeat until done

## Tool Use / Function Calling

**Function calling** (or tool use) lets the model request structured actions. You define tools with:

- **Name** — e.g., `search_web`
- **Description** — What it does; the model uses this to choose
- **Parameters** — JSON schema (inputs the tool needs)

The model outputs a tool call (name + arguments); your code executes it and returns the result; the model sees the result and continues.

## MCP — Model Context Protocol

**MCP** is an open standard for exposing tools to LLMs. Instead of each app defining its own format, MCP defines:

- How tools are described (schemas)
- How to invoke them
- How to stream results

Tools (file systems, databases, APIs, custom logic) can be packaged as MCP servers and used by any MCP-compatible client.

## Planning Strategies

| Strategy | Idea |
|----------|------|
| **ReAct** | Reason + Act: alternate reasoning steps and tool calls in the same output |
| **Chain-of-thought planning** | Plan steps in text before acting; then execute |
| **Reflexion** | Reflect on failures; update plan and retry |

The agent chooses when to use tools vs. when to answer directly.

## Memory

- **Short-term** — The current conversation (context window). All prior turns and tool results.
- **Long-term** — External store (vector DB, key-value) the agent can read/write across sessions.

Agents often use RAG for long-term knowledge and a session store for the current task.

## Multi-Agent Systems

Multiple agents can cooperate:

```mermaid
flowchart TB
    subgraph orch["Orchestrator"]
        O[Coordinator Agent]
    end
    subgraph workers["Workers"]
        A1[Researcher]
        A2[Coder]
        A3[Reviewer]
    end
    O --> A1
    O --> A2
    O --> A3
    A1 --> O
    A2 --> O
    A3 --> O
```

- **Orchestrator** — Delegates subtasks, merges results
- **Specialists** — Researcher (search), Coder (write code), Reviewer (check quality)

## Safety

| Concern | Mitigation |
|---------|------------|
| **Sandboxing** | Run tools in isolated env; limit file/network access |
| **Human-in-the-loop** | Require approval for sensitive actions (e.g., deploy, pay) |
| **Guardrails** | Validate inputs/outputs; block dangerous patterns |
| **Rate limits** | Prevent runaway loops and cost explosions |

## Real-World Examples

- **Claude Code** — Coding agent with file edit, shell, web search
- **Research agents** — Search, summarize, cite
- **Customer support** — Query KB, create tickets, escalate

## Current Limitations

- **Hallucinated tool calls** — Model may invent tool names or parameters
- **Infinite loops** — Agent keeps retrying without progress
- **Cost** — Many tool calls = many tokens
- **Brittleness** — Unclear tool descriptions lead to wrong choices

---

## Key Takeaways

1. **Agent = LLM + tools + loop** — observe, think, act, repeat
2. **Tool use** — Define schemas; model chooses and calls; you execute
3. **MCP** — Standard way to expose tools to LLMs
4. **Planning** — ReAct, CoT, Reflexion; balance reasoning vs. acting
5. **Safety** — Sandbox, human-in-the-loop, guardrails