# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.1.0] - 2025-11-13

### Added

- **Core Metrics Collection**
  - Agent metrics tracking for LLM agents and AI systems
  - Tool metrics tracking for function calls and RAG operations
  - Latency metrics for performance monitoring
  - Request timing metrics for client-server analysis

- **Type Safety**
  - Full TypeScript support with strict types
  - Comprehensive type definitions for all metric types
  - IntelliSense support for better developer experience

- **Validation System**
  - Built-in metric validation (configurable)
  - Validation for agent, tool, latency, and request timing metrics
  - Detailed error messages for invalid data

- **Persistence Layer**
  - In-memory storage with configurable limits
  - Pluggable persistence backend interface
  - Support for custom database backends (PostgreSQL, MongoDB, Redis, etc.)

- **Helper Functions**
  - `measureAgent()` - Automatic agent execution measurement
  - `measureAgentWithMetadata()` - Agent measurement with custom metadata extraction
  - `measureTool()` - Automatic tool execution measurement
  - `measureToolWithMetadata()` - Tool measurement with custom metadata extraction

- **Formatting Utilities**
  - `formatDuration()` - Human-readable duration formatting
  - `formatDurationDetailed()` - Detailed duration breakdown
  - `formatAgentMetrics()` - Formatted agent metrics output
  - `formatToolMetrics()` - Formatted tool metrics output
  - `formatLatencyMetrics()` - Formatted latency metrics output
  - `formatMetricsSummary()` - Formatted summary statistics

- **Aggregation & Analysis**
  - Summary statistics calculation
  - Time-range filtering for metrics
  - Context-based metric queries
  - Average, min, max, and total calculations

- **Configuration**
  - Configurable max metrics limit
  - Optional metric validation
  - Custom logger support
  - Custom persistence backend support

- **Testing**
  - Comprehensive test suite (64 tests, 165 assertions)
  - Unit tests for all core modules
  - Validation tests
  - Formatting tests

- **Documentation**
  - Complete README with examples
  - API reference documentation
  - Usage examples for common scenarios
  - Best practices guide

- **Developer Experience**
  - Zero runtime dependencies
  - ESM-only (modern JavaScript)
  - Bun and Node.js 22+ support
  - TypeScript 5.6+ support

### Technical Details

- **Package**: Published to npm as `llm-metrics@0.1.0`
- **License**: MIT
- **Repository**: https://github.com/Arakiss/llm-metrics
- **Build**: TypeScript compilation to ESM
- **Test Coverage**: 64 tests across 6 test files

---

## [0.2.0] - 2025-11-13

### Added
- Export utilities for JSON and CSV formats (`exportToJSON`, `exportAgentsToCSV`, `exportAllToCSV`, etc.)
- Advanced aggregations with percentiles (`aggregateAgents`, `aggregateTools`, `aggregateLatency`)
- Histogram generation (`createHistogram`, `createAgentHistogram`, `createToolHistogram`, `createLatencyHistogram`)
- Comprehensive edge case tests for collector error handling and format functions
- Examples directory with real-world usage examples:
  - Next.js API route integration
  - Express middleware integration
  - AI SDK (Vercel) integration
  - Export metrics examples
  - Aggregations examples
- Adapters API documentation (`src/adapters/README.md`) with examples for PostgreSQL, MongoDB, and Redis
- VS Code snippets (`.vscode/snippets.json`) for common patterns
- GitHub issue and PR templates
- Expanded test suite (110 tests, 256 assertions)

### Changed
- Improved test coverage from ~91% to over 95% by adding edge case tests
- Enhanced `formatDuration` to handle edge cases more accurately
- Updated README with links to examples and adapters documentation

### Fixed
- Fixed percentile calculation edge cases in aggregations
- Fixed duration formatting for edge cases (9.9s, very large durations)
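
The percentile edge cases are easier to picture with a concrete sketch. The block below is a standalone nearest-rank percentile, not the library's actual implementation; the function name `percentile` and the choice to return `undefined` for empty input are assumptions made for illustration.

```typescript
// Hypothetical helper: nearest-rank percentile over a list of durations (ms).
// Not the llm-metrics implementation; it only illustrates the edge cases.
function percentile(values: number[], p: number): number | undefined {
  if (values.length === 0) return undefined; // edge case: no samples
  if (p < 0 || p > 100) throw new RangeError("p must be between 0 and 100");
  const sorted = [...values].sort((a, b) => a - b); // numeric sort on a copy
  // Nearest rank: the smallest value with at least p% of samples at or below it.
  // Math.max keeps p = 0 from producing rank 0.
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}
```

With a single sample every percentile collapses to that sample, and `p = 0` returns the minimum rather than indexing out of bounds.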

## [0.3.0] - 2025-11-13

### Added
- Event hooks/callbacks system (`MetricsEventCallbacks`)
  - `onAgentRecorded` - Called when agent metrics are recorded
  - `onToolRecorded` - Called when tool metrics are recorded
  - `onLatencyRecorded` - Called when latency metrics are recorded
  - `onRequestTimingRecorded` - Called when request timing metrics are recorded
- `setCallbacks()` method to configure callbacks after construction
- Event hooks example (`examples/event-hooks.ts`)
- Comprehensive tests for event hooks (15 new tests)

### Changed
- `MetricsCollectorConfig` now includes optional `callbacks` field
- Callbacks are called after metrics are recorded and persisted
- Callback errors are caught and logged, preventing crashes
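
One way to isolate callback failures from the recording path is sketched below. The names `safeInvoke` and `logError` are hypothetical, the sketch covers only synchronous callbacks, and the real `MetricsEventCallbacks` handling may differ.

```typescript
// Hypothetical helper; not part of the llm-metrics API.
// Runs an optional callback and reports whether it completed without throwing.
function safeInvoke<T>(
  callback: ((metric: T) => void) | undefined,
  metric: T,
  logError: (message: string, err: unknown) => void,
): boolean {
  if (!callback) return true; // nothing registered, nothing to do
  try {
    callback(metric);
    return true;
  } catch (err) {
    // Log and swallow the error so the caller that recorded the metric is unaffected.
    logError("metrics callback failed", err);
    return false;
  }
}
```

Because the error is logged and swallowed, a faulty `onAgentRecorded` handler in this sketch cannot interrupt the code that recorded the metric.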

## [0.4.0] - 2025-11-13

### Added
- Flexible query/filter API (`queryMetrics()` method)
  - Filter by multiple context IDs (`contextIds`)
  - Filter by agent IDs (`agentIds`)
  - Filter by tool names (`toolNames`)
  - Filter by operation names (`operations`)
  - Filter by time range (`startTime`, `endTime`)
  - Filter by duration range (`minDuration`, `maxDuration`)
  - Filter by success status (`success`)
  - Filter by error presence (`hasError`)
  - Filter by metadata key-value pairs (`metadata`)
- Query utility functions (`filterAgents`, `filterTools`, `filterLatency`, `filterRequestTimings`)
- `MetricsFilter` interface for type-safe filtering
- Query/filter example (`examples/query-filter.ts`)
- Comprehensive tests for query/filter API (15 new tests)

### Changed
- `MetricsCollector` now exposes `queryMetrics()` for flexible querying
- All filters support combining multiple criteria
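
The sketch below shows one plausible reading of "combining multiple criteria": every criterion that is set must match (logical AND), and unset criteria are ignored. The `SampleMetric` and `SampleFilter` shapes are hypothetical; their field names mirror the filter keys listed above, but the real `MetricsFilter` interface and its semantics may differ.

```typescript
// Hypothetical record and filter shapes, for illustration only.
interface SampleMetric {
  contextId: string;
  duration: number; // milliseconds
  success: boolean;
}

interface SampleFilter {
  contextIds?: string[];
  minDuration?: number;
  maxDuration?: number;
  success?: boolean;
}

// A metric matches only if it passes every criterion that is set.
function matches(m: SampleMetric, f: SampleFilter): boolean {
  if (f.contextIds && !f.contextIds.includes(m.contextId)) return false;
  if (f.minDuration !== undefined && m.duration < f.minDuration) return false;
  if (f.maxDuration !== undefined && m.duration > f.maxDuration) return false;
  if (f.success !== undefined && m.success !== f.success) return false;
  return true;
}

const sample: SampleMetric[] = [
  { contextId: "a", duration: 120, success: true },
  { contextId: "a", duration: 900, success: false },
  { contextId: "b", duration: 300, success: true },
];

// Failed operations in context "a" that took at least 500 ms.
const slowFailuresInA = sample.filter((m) =>
  matches(m, { contextIds: ["a"], minDuration: 500, success: false }),
);
```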

## [0.5.0] - 2025-11-13

### Added
- Batch operations for efficient metric recording
  - `recordAgents(metricsArray)` - Record multiple agents in batch
  - `recordTools(metricsArray)` - Record multiple tools in batch
  - `recordLatencies(metricsArray)` - Record multiple latency metrics in batch
  - `recordRequestTimings(metricsArray)` - Record multiple request timings in batch
- Batch operations example (`examples/batch-operations.ts`)
- Comprehensive tests for batch operations (8 new tests)
- `getSnapshot()` now includes `requestTimings` field

### Changed
- Batch operations validate each metric individually
- Batch operations are more efficient than individual `record*()` calls, which makes them useful for migrations, imports, and other bulk operations
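
Per-item validation in a batch can be sketched as follows. This is an assumption about the general approach, not the library's code: `LatencySample`, `validateSample`, and `recordBatch` are hypothetical names, and whether the real `recordLatencies()` skips, reports, or throws on an invalid entry is not stated here.

```typescript
// Hypothetical shapes and helpers, for illustration only.
interface LatencySample {
  operation: string;
  duration: number; // milliseconds
}

interface BatchResult {
  accepted: LatencySample[];
  rejected: { index: number; reason: string }[];
}

// Returns a reason string when the sample is invalid, otherwise undefined.
function validateSample(s: LatencySample): string | undefined {
  if (s.operation.trim() === "") return "operation must be non-empty";
  if (!Number.isFinite(s.duration) || s.duration < 0) {
    return "duration must be a non-negative finite number";
  }
  return undefined;
}

// Validates each sample on its own, so one bad entry does not reject the batch.
function recordBatch(samples: LatencySample[]): BatchResult {
  const result: BatchResult = { accepted: [], rejected: [] };
  samples.forEach((sample, index) => {
    const reason = validateSample(sample);
    if (reason === undefined) result.accepted.push(sample);
    else result.rejected.push({ index, reason });
  });
  return result;
}
```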

## [0.6.0] - 2025-11-13

### Added
- Derived metrics calculations
  - `calculateRate()` - Calculate operations per second
  - `calculateErrorRate()` - Calculate error rate percentage
  - `calculateSuccessRate()` - Calculate success rate percentage
  - `calculateAgentDerivedMetrics()` - Comprehensive agent metrics with rates
  - `calculateToolDerivedMetrics()` - Comprehensive tool metrics with rates
  - `calculateTrend()` - Calculate trends between time periods
  - `calculateErrorRateTrend()` - Calculate error rate trends
- Derived metrics types (`AgentDerivedMetrics`, `ToolDerivedMetrics`, `TrendMetrics`)
- Indexing utilities (`createContextIndex`, `createAgentIndex`, `createToolIndex`)
- Derived metrics example (`examples/derived-metrics.ts`)
- Comprehensive tests for derived metrics (12 new tests)

### Changed
- Derived metrics are simple, optional utilities (not part of the core collector)
- Indexing utilities are exported for advanced use cases
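
The arithmetic behind rate-style derived metrics is small enough to sketch. The helpers below are deliberately named differently from the library's functions (`errorRatePercent`, `opsPerSecond`) because their signatures and zero-handling are assumptions, not the documented behavior of `calculateErrorRate()` or `calculateRate()`.

```typescript
// Hypothetical helpers, for illustration only.
interface Outcome {
  success: boolean;
}

// Percentage of failed outcomes; returns 0 for an empty list to avoid dividing by zero.
function errorRatePercent(outcomes: Outcome[]): number {
  if (outcomes.length === 0) return 0;
  const failures = outcomes.filter((o) => !o.success).length;
  return (failures / outcomes.length) * 100;
}

// Operations per second over a window given in milliseconds.
// A non-positive window yields 0 instead of Infinity or NaN.
function opsPerSecond(count: number, windowMs: number): number {
  if (windowMs <= 0) return 0;
  return count / (windowMs / 1000);
}
```

Under these assumptions the success rate is simply `100 - errorRatePercent(outcomes)` for a non-empty list.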

## [0.7.0] - 2025-11-14

### Added
- **Timeline Analysis Module** - Comprehensive timeline breakdown utilities
  - `generateTimelineBreakdown()` - Generate complete timeline analysis from request timing metrics
  - `formatTimelineAnalysis()` - Format timeline analysis for logging/display
  - `TimelineBreakdown` interface - Structured timeline event breakdown
  - `TimingComparison` interface - Server vs client timing comparison
- **Automatic Timeline Analysis** - Configurable automatic timeline generation
  - `timelineAnalysis` configuration option in `MetricsCollectorConfig`
  - Automatic timeline breakdown generation when client timing is available
  - Configurable log level (`info`, `debug`, `warn`)
  - Custom duration formatter support
- **Enhanced Request Timing Metrics**
  - `clientFirstDataTime` field added to `RequestTimingMetrics` interface
  - Better support for client-server timing comparison
- **Timeline Analysis Integration**
  - Automatic timeline analysis triggered when `recordRequestTiming()` is called with client timing
  - Non-blocking async timeline generation (doesn't affect request performance)
  - Optional feature (disabled by default, can be enabled via configuration)

### Changed
- `MetricsCollectorConfig` now includes optional `timelineAnalysis` configuration
- `configure()` method now supports updating timeline analysis configuration
- Timeline analysis is exported from main package (`export * from './timeline'`)

### Technical Details
- Timeline analysis uses dynamic imports to avoid circular dependencies
- Timeline generation is async and non-blocking (uses `setTimeout`)
- Fully backward compatible - timeline analysis is opt-in
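
The non-blocking pattern described above can be sketched as a deferral: the metric is recorded synchronously and the analysis is handed to a scheduler. Everything here is hypothetical (`Scheduler`, `recordThenAnalyze`, the injectable scheduler, and the decision to swallow analysis errors); the library's internals may look different.

```typescript
// Hypothetical sketch, for illustration only.
type Scheduler = (task: () => void) => void;

// Default: defer to a later turn of the event loop via setTimeout.
const defaultScheduler: Scheduler = (task) => {
  setTimeout(task, 0);
};

// Records first, then schedules the analysis so the caller never waits for it.
function recordThenAnalyze(
  record: () => void,
  analyze: () => void,
  schedule: Scheduler = defaultScheduler,
): void {
  record();
  schedule(() => {
    try {
      analyze();
    } catch {
      // Assumption: an analysis failure is ignored so it cannot affect recording.
    }
  });
}
```

Passing the scheduler as a parameter is a design choice of this sketch: it keeps the default `setTimeout` behavior while letting tests run the deferred task deterministically.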

## [Unreleased]

### Planned

- Performance optimizations (internal indexing integration, lazy evaluation)
- Additional derived metrics as needed

[0.7.0]: https://github.com/Arakiss/llm-metrics/releases/tag/v0.7.0
[0.6.0]: https://github.com/Arakiss/llm-metrics/releases/tag/v0.6.0
[0.5.0]: https://github.com/Arakiss/llm-metrics/releases/tag/v0.5.0
[0.4.0]: https://github.com/Arakiss/llm-metrics/releases/tag/v0.4.0
[0.2.0]: https://github.com/Arakiss/llm-metrics/releases/tag/v0.2.0
[0.1.0]: https://github.com/Arakiss/llm-metrics/releases/tag/v0.1.0

