# Changelog v0.3.80 - 2026-05-30

### Changed
- Split the former single **AI Speed** table value into two clearer columns: **AI Latency** for benchmark wall-clock response time and **TPS** for rounded tokens per second. When a model's Health is not good, AI Latency now mirrors the exact Health text and TPS stays `—`, so benchmark columns do not introduce a second, conflicting error language.
- Made the AI benchmark prompt request one cohesive 80–100 word paragraph instead of a tiny one-sentence answer, and raised the benchmark cap to 140 output tokens. This gives latency and TPS measurements enough generated text to be more stable and meaningful.

### Fixed
- Disabled reasoning/thinking for lightweight model ping payloads by sending `thinking: { type: "disabled" }` on OpenAI-compatible chat-completion probes. This keeps latency checks focused on network/model availability instead of accidentally paying for hidden reasoning tokens.
- Reused the same disabled-thinking ping payload for router health probes, while leaving Replicate prediction probes unchanged because they do not use the OpenAI chat-completions schema.
