---
title: "Evals Quickstart"
description: Add quality scoring to your Smithers workflow in under five minutes.
---

This guide walks you through adding scorers to an existing workflow. By the end, you will have live scoring on every task run, with results visible in the CLI and TUI.

## Prerequisites

- A working Smithers workflow (see [Tutorial: Build a Workflow](/guides/tutorial-workflow))
- At least one `<Task>` with an agent

## Step 1: Import Scorers

```tsx
import {
  schemaAdherenceScorer,
  latencyScorer,
  relevancyScorer,
} from "smithers-orchestrator/scorers";
```

## Step 2: Attach Scorers to a Task

Add the `scorers` prop to any `<Task>`:

```tsx
```

Code-based scorers such as `schemaAdherenceScorer` and `latencyScorer` require no additional LLM calls.

## Step 3: Add LLM-based Scoring (Optional)

For LLM-as-judge evaluation, pass an agent to the scorer factory:

```tsx
import { AnthropicAgent } from "smithers-orchestrator";

const judge = new AnthropicAgent({
  model: "claude-sonnet-4-20250514",
});
```

## Step 4: Run Your Workflow

```bash
smithers up workflow.tsx
```

If you are running a discovered workflow from `.smithers/workflows`, use `smithers workflow run ` instead.

Scorers run asynchronously after each task finishes, so they do not block your workflow.

## Step 5: View Scores

### CLI

```bash
# List all scores for a run
smithers scores
```

Example output:

```
Scores for run abc123
┌──────────┬────────────────────┬───────┬───────────────────────────────┐
│ Node     │ Scorer             │ Score │ Reason                        │
├──────────┼────────────────────┼───────┼───────────────────────────────┤
│ analyze  │ Schema Adherence   │ 1.00  │ Output matches schema         │
│ analyze  │ Latency            │ 0.85  │ 7200ms (target: 5000ms)       │
│ analyze  │ Relevancy          │ 0.92  │ Output directly addresses ... │
└──────────┴────────────────────┴───────┴───────────────────────────────┘
```

### TUI

Open the TUI with `smithers tui`, navigate to a task, and switch to the **Scores** tab to see per-task scoring results.

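Each row in the example CLI output under Step 5 carries a node, a scorer name, a score, and a reason. As a rough sketch of how such rows might be post-processed, the snippet below flags scores under a threshold. The `ScoreRow` type and the `belowThreshold` helper are hypothetical names for illustration, not part of the Smithers API, and the data is copied by hand from the example output rather than read from a real run.

```typescript
// Hypothetical row shape mirroring the columns of the example CLI table.
// This is an assumption for illustration, not a Smithers type.
interface ScoreRow {
  node: string;
  scorer: string;
  score: number; // expected to lie in the 0-1 range
  reason: string;
}

// Hypothetical helper: keep only the rows whose score is below `threshold`.
function belowThreshold(rows: ScoreRow[], threshold: number): ScoreRow[] {
  return rows.filter((row) => row.score < threshold);
}

// Rows copied by hand from the example output.
const exampleRows: ScoreRow[] = [
  { node: "analyze", scorer: "Schema Adherence", score: 1.0, reason: "Output matches schema" },
  { node: "analyze", scorer: "Latency", score: 0.85, reason: "7200ms (target: 5000ms)" },
  { node: "analyze", scorer: "Relevancy", score: 0.92, reason: "Output directly addresses ..." },
];

for (const row of belowThreshold(exampleRows, 0.9)) {
  console.log(`${row.node} / ${row.scorer}: ${row.score.toFixed(2)} (${row.reason})`);
}
// prints "analyze / Latency: 0.85 (7200ms (target: 5000ms))"
```

The 0.9 threshold is arbitrary; pick a cutoff that matches what you consider an acceptable score for each scorer.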
## Step 6: Custom Scorers

Build your own scorer with `createScorer`:

```ts
import { createScorer } from "smithers-orchestrator/scorers";

const wordCountScorer = createScorer({
  id: "word-count",
  name: "Word Count",
  description: "Scores based on output word count",
  score: async ({ output }) => {
    // Trim and drop empty tokens so that empty output counts as 0 words, not 1.
    const words = String(output).trim().split(/\s+/).filter(Boolean).length;
    // Scale linearly up to 200 words, then cap the score at 1.
    const score = Math.min(words / 200, 1);
    return {
      score,
      reason: `Output contains ${words} words`,
    };
  },
});
```

## Step 7: LLM-as-Judge Custom Scorers

Use `llmJudge` to build custom LLM-based scorers:

```ts
import { llmJudge } from "smithers-orchestrator/scorers";

const toneScorer = llmJudge({
  id: "professional-tone",
  name: "Professional Tone",
  description: "Evaluates if the output maintains a professional tone",
  judge, // the agent created in Step 3
  instructions:
    "You evaluate whether text maintains a professional, business-appropriate tone.",
  promptTemplate: ({ input, output }) =>
    `Rate the professionalism of this response on a scale of 0-1.\n\nInput: ${String(input)}\n\nOutput: ${String(output)}\n\nRespond with a JSON object: { "score": <number>, "reason": "<string>" }`,
});
```

## Batch Evaluation

For testing and offline evaluation, call `runScorersBatch` directly:

```ts
import {
  runScorersBatch,
  schemaAdherenceScorer,
} from "smithers-orchestrator/scorers";

// `analysisSchema` and `adapter` are assumed to be defined elsewhere in your project.
const results = await runScorersBatch(
  // Scorers to run, keyed by name
  {
    myScorer: { scorer: schemaAdherenceScorer() },
  },
  // The sample to score
  {
    runId: "test-run",
    nodeId: "analyze",
    iteration: 0,
    attempt: 0,
    input: "Analyze this code",
    output: { summary: "The code is clean" },
    outputSchema: analysisSchema,
  },
  adapter,
);
```
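For offline checks, it can also help to keep a scorer's scoring logic in a plain function that can be tested without a workflow run. The sketch below does this for the word-count idea from Step 6. The `wordCountScore` helper, its `targetWords` parameter, and the `ScoreResult` type are hypothetical names for illustration, not part of the Smithers API; the assumption is that such a function could then be called from the `score` callback passed to `createScorer`.

```typescript
// Hypothetical result shape matching what the Step 6 scorer returns.
interface ScoreResult {
  score: number;
  reason: string;
}

// Hypothetical helper: pure word-count scoring, independent of any Smithers API.
// `targetWords` is the word count at which the score reaches its cap of 1.
function wordCountScore(output: unknown, targetWords = 200): ScoreResult {
  // Treat null/undefined as empty text; stringify everything else.
  const text = String(output ?? "");
  // Trim and drop empty tokens so blank output counts as 0 words.
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  const score = Math.min(words / targetWords, 1);
  return { score, reason: `Output contains ${words} words` };
}

// Quick offline checks with small targets so the arithmetic is easy to follow.
console.log(wordCountScore("one two three four", 8).score); // prints 0.5
console.log(wordCountScore("").score); // prints 0
console.log(wordCountScore("a b c", 2).score); // prints 1
```

Keeping the logic pure means edge cases such as empty or whitespace-only output can be covered by ordinary unit tests before the scorer is attached to a task.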