# Tone consistency scorer

The `createToneScorer()` function evaluates the emotional tone and sentiment consistency of text. It operates in one of two modes: comparing tone between an input/output pair, or analyzing tone stability within a single text.

## Parameters

The `createToneScorer()` function doesn't take any options.

This function returns an instance of the MastraScorer class. See the [MastraScorer reference](https://mastra.ai/reference/evals/mastra-scorer) for details on the `.run()` method and its input/output.

## `.run()` returns

**runId** (`string`): The id of the run (optional).

**analyzeStepResult** (`object`): Object with tone metrics. In comparison mode: `{ responseSentiment: number, referenceSentiment: number, difference: number }`. In stability mode: `{ avgSentiment: number, sentimentVariance: number }`.

**score** (`number`): Tone consistency/stability score (0-1).

`.run()` returns a result in the following shape:

```typescript
{
  runId: string,
  analyzeStepResult: {
    responseSentiment?: number,
    referenceSentiment?: number,
    difference?: number,
    avgSentiment?: number,
    sentimentVariance?: number,
  },
  score: number
}
```

## Scoring details

The scorer evaluates sentiment consistency through tone pattern analysis and mode-specific scoring.

### Scoring process

1. Analyzes tone patterns:

   - Extracts sentiment features
   - Computes sentiment scores
   - Measures tone variations

2. Calculates mode-specific score:

   **Tone Consistency** (input and output):

   - Compares sentiment between texts
   - Calculates sentiment difference
   - Score = 1 - (sentiment\_difference / max\_difference)

   **Tone Stability** (single input):

   - Analyzes sentiment across sentences
   - Calculates sentiment variance
   - Score = 1 - (sentiment\_variance / max\_variance)

Final score: `mode_specific_score * scale`
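
The two formulas above can be sketched in plain TypeScript. This is an illustration of the arithmetic only, not the scorer's actual source: the function names, the `MAX_DIFFERENCE` and `MAX_VARIANCE` bounds, the clamping to 0-1, and the use of population variance are all assumptions, since this reference does not state them.

```typescript
// Hypothetical normalization bounds; the reference does not give these values.
const MAX_DIFFERENCE = 1
const MAX_VARIANCE = 1

// Clamp into [0, 1] so extreme inputs cannot produce an out-of-range score (assumption).
function clamp01(value: number): number {
  return Math.min(1, Math.max(0, value))
}

// Comparison mode: score from the absolute gap between two sentiment values.
function toneConsistencyScore(
  responseSentiment: number,
  referenceSentiment: number,
  scale = 1,
): number {
  const difference = Math.abs(responseSentiment - referenceSentiment)
  return clamp01(1 - difference / MAX_DIFFERENCE) * scale
}

// Stability mode: score from the variance of per-sentence sentiment values.
function toneStabilityScore(sentenceSentiments: number[], scale = 1): number {
  // Assumption: with no sentences there is nothing to vary, so return the top score.
  if (sentenceSentiments.length === 0) return scale

  const count = sentenceSentiments.length
  const avgSentiment = sentenceSentiments.reduce((sum, s) => sum + s, 0) / count
  const sentimentVariance =
    sentenceSentiments.reduce((sum, s) => sum + (s - avgSentiment) ** 2, 0) / count

  return clamp01(1 - sentimentVariance / MAX_VARIANCE) * scale
}
```

Under these assumptions, identical sentiment values give the maximum score in both modes, and the score falls as the difference or the variance grows.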

### Score interpretation

Scores range from 0 to `scale` (0-1 by default):

- 1.0: Perfect tone consistency/stability
- 0.7-0.9: Strong consistency with minor variations
- 0.4-0.6: Moderate consistency with noticeable shifts
- 0.1-0.3: Poor consistency with major tone changes
- 0.0: No consistency - completely different tones
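
If you want to turn a score into one of these bands in code, a small helper along the following lines could work. It is a sketch, not part of the Mastra API: `interpretToneScore` and the `ToneBand` labels are hypothetical names, and rounding the score to one decimal place before matching is an assumption made so that every value falls into exactly one of the listed bands.

```typescript
// Hypothetical labels for the five bands listed above.
type ToneBand = 'perfect' | 'strong' | 'moderate' | 'poor' | 'none'

// Map a score to a band. The score is divided by `scale` and rounded to
// one decimal place first (assumption), then matched against the bands.
function interpretToneScore(score: number, scale = 1): ToneBand {
  if (scale <= 0) throw new RangeError('scale must be positive')

  const tenths = Math.round((score / scale) * 10)
  if (tenths >= 10) return 'perfect' // 1.0
  if (tenths >= 7) return 'strong' // 0.7-0.9
  if (tenths >= 4) return 'moderate' // 0.4-0.6
  if (tenths >= 1) return 'poor' // 0.1-0.3
  return 'none' // 0.0
}
```

Because of the rounding step, a score such as 0.95 is treated as 1.0; adjust the thresholds if your use case needs stricter boundaries.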

### `analyzeStepResult`

Object with tone metrics:

- **responseSentiment**: Sentiment score for the response (comparison mode).
- **referenceSentiment**: Sentiment score for the input/reference (comparison mode).
- **difference**: Absolute difference between sentiment scores (comparison mode).
- **avgSentiment**: Average sentiment across sentences (stability mode).
- **sentimentVariance**: Variance of sentiment across sentences (stability mode).
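
Since the fields present depend on the mode, code that consumes `analyzeStepResult` may want to narrow it before reading values. The sketch below models the two shapes as a union and checks for the `difference` field. The type names and helper functions are hypothetical and are not exported by Mastra; if the result you receive is typed with all fields optional, you may need to cast it to the union first.

```typescript
// Field names come from the list above; the type names are hypothetical.
interface ComparisonToneResult {
  responseSentiment: number
  referenceSentiment: number
  difference: number
}

interface StabilityToneResult {
  avgSentiment: number
  sentimentVariance: number
}

type ToneAnalyzeStepResult = ComparisonToneResult | StabilityToneResult

// Type guard: only comparison-mode results carry a `difference` field.
function isComparisonResult(
  result: ToneAnalyzeStepResult,
): result is ComparisonToneResult {
  return 'difference' in result
}

// Build a short, mode-aware summary string for logging.
function describeToneResult(result: ToneAnalyzeStepResult): string {
  if (isComparisonResult(result)) {
    return `comparison: response=${result.responseSentiment}, reference=${result.referenceSentiment}, difference=${result.difference}`
  }
  return `stability: avg=${result.avgSentiment}, variance=${result.sentimentVariance}`
}
```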

## Example

Evaluate tone consistency between related agent responses:

```typescript
import { runEvals } from '@mastra/core/evals'
import { createToneScorer } from '@mastra/evals/scorers/prebuilt'
import { myAgent } from './agent'

const scorer = createToneScorer()

const result = await runEvals({
  data: [
    {
      input: 'How was your experience with our service?',
      groundTruth: 'The service was excellent and exceeded expectations!',
    },
    {
      input: 'Tell me about the customer support',
      groundTruth: 'The support team was friendly and very helpful.',
    },
  ],
  scorers: [scorer],
  target: myAgent,
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
    })
  },
})

console.log(result.scores)
```

For more details on `runEvals`, see the [runEvals reference](https://mastra.ai/reference/evals/run-evals).

To add this scorer to an agent, see the [Scorers overview](https://mastra.ai/docs/evals/overview) guide.

## Related

- [Content Similarity Scorer](https://mastra.ai/reference/evals/content-similarity)
- [Toxicity Scorer](https://mastra.ai/reference/evals/toxicity)