# Parser Module Data Flow

## What It Does
A lightweight tokenization and parsing layer that converts formula strings into an AST and provides a pluggable evaluator. It isolates lexical rules, operator precedence, and function/cell resolution, and produces a deterministic AST suitable for caching and reuse by the calculate engine.

## Entry Points
- `tokenize(input: string): Token[]`: produces a token stream from formula text.
- `parse(input: string): ASTNode`: tokenizes the formula text and parses the resulting tokens into an AST (recursive descent with precedence climbing).
- `evaluate(ast: ASTNode, options?: {cellResolver?, functionResolver?}): any`: evaluates an AST with pluggable resolvers.
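The data shapes behind these signatures could look roughly like the sketch below. Only the node kind names and the resolver names come from this document; the field names (`value`, `ref`, `op`, `operand`, `left`, `right`, `name`) and the resolver signatures are assumptions made for illustration.

```typescript
// Hypothetical AST shape: a discriminated union keyed on `kind`.
// Kind names follow this document; all other field names are assumed.
type ASTNode =
  | { kind: "Number"; value: number }
  | { kind: "String"; value: string }
  | { kind: "Cell"; ref: string }
  | { kind: "Unary"; op: string; operand: ASTNode }
  | { kind: "Binary"; op: string; left: ASTNode; right: ASTNode }
  | { kind: "FunctionCall"; name: string; args: ASTNode[] };

// Assumed resolver signatures for the `options` argument of evaluate().
interface EvaluateOptions {
  cellResolver?: (ref: string) => unknown;
  functionResolver?: (name: string, args: unknown[]) => unknown;
}

// The tree that parse("A1 + 2") might produce under these assumptions.
const exampleAst: ASTNode = {
  kind: "Binary",
  op: "+",
  left: { kind: "Cell", ref: "A1" },
  right: { kind: "Number", value: 2 },
};

// A resolver bundle that a headless test could pass to evaluate().
const exampleOptions: EvaluateOptions = {
  cellResolver: (ref) => (ref === "A1" ? 40 : 0),
};

// Nodes are plain objects, so they survive a JSON round trip unchanged.
const roundTripped = JSON.parse(JSON.stringify(exampleAst)) as ASTNode;
```

Keeping `kind` as a string literal on every node lets the evaluator `switch` on it and lets the TypeScript compiler check that every node kind is handled.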

## ASCII Core Logic Flow

Raw formula text
        ↓
`tokenize()` → stream of `Token` objects (Number/String/Identifier/Cell/Operator/Paren/Comma)
        ↓
`parse()` → recursive-descent parser builds `ASTNode` tree (Number/String/Cell/Unary/Binary/FunctionCall)
        ↓
(Optionally) cache parsed `ASTNode` in `FormulaInfo` for reuse
        ↓
`evaluate(ast, {cellResolver,functionResolver})` → recursive AST walk
        ↓
resolvers return values → compute operators/functions → final value
        ↓
return value or throw/report parse/runtime error
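The first stage of this flow can be sketched as a small rule-based lexer. This is a minimal illustration, not the module's actual implementation: the `pos` field, the operator set, the doubled-quote string escape, and the exact cell pattern are all assumptions.

```typescript
type TokenKind = "Number" | "String" | "Identifier" | "Cell" | "Operator" | "Paren" | "Comma";

interface Token {
  kind: TokenKind;
  text: string; // source text; identifiers and cell refs are upper-cased
  pos: number;  // offset of the token in the formula (assumed field)
}

// Ordered rules, each anchored at the start of the remaining input.
// Cell comes before Identifier so that "A1" is not read as a plain name.
// Caveat: a name such as LOG10 also matches the Cell pattern; a fuller
// lexer would need to disambiguate, for example on a following "(".
const RULES: Array<[TokenKind, RegExp]> = [
  ["Number", /^\d+(?:\.\d+)?/],
  ["String", /^"(?:[^"]|"")*"/], // assumes "" escapes a quote inside a string
  ["Cell", /^[A-Za-z]{1,3}[0-9]+(?![A-Za-z0-9_])/],
  ["Identifier", /^[A-Za-z_][A-Za-z0-9_]*/],
  ["Operator", /^(?:<=|>=|<>|[-+*\/=<>])/], // multi-char operators first
  ["Paren", /^[()]/],
  ["Comma", /^,/],
];

function tokenize(input: string): Token[] {
  const tokens: Token[] = [];
  let pos = 0;
  while (pos < input.length) {
    if (/\s/.test(input.charAt(pos))) {
      pos++; // whitespace separates tokens but produces none
      continue;
    }
    const rest = input.slice(pos);
    let matched = false;
    for (const [kind, rule] of RULES) {
      const m = rule.exec(rest);
      if (m === null) continue;
      const raw = m[0];
      const normalize = kind === "Identifier" || kind === "Cell";
      tokens.push({ kind, text: normalize ? raw.toUpperCase() : raw, pos });
      pos += raw.length;
      matched = true;
      break;
    }
    // No rule matched: this also covers an unterminated string literal.
    if (!matched) {
      throw new Error(`Unexpected character '${input.charAt(pos)}' at offset ${pos}`);
    }
  }
  return tokens;
}
```

For example, `tokenize('sum(a1, 2.5)')` yields an `Identifier` with text `SUM`, a `Paren`, a `Cell` with text `A1`, a `Comma`, a `Number`, and a closing `Paren`.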

## Operations & Key Functions
- Lexing
  - Recognizes numbers, quoted strings, identifiers, basic A1-style cell refs, multi-char operators (`<=`, `>=`, `<>`), commas, and parentheses.
  - Produces normalized identifiers for function lookup.
- Parsing
  - Uses precedence-climbing for binary operators and recursive descent for primary expressions and function calls.
  - Handles unary `+`/`-`, parenthesized grouping, and zero-argument identifiers (treated as functions or named values).
- AST Shape
  - Node kinds: `Number`, `String`, `Cell`, `Unary`, `Binary`, `FunctionCall`.
  - Function calls contain ordered `args[]` for evaluation.
- Evaluation
  - Expects `cellResolver(ref)` to resolve `Cell` nodes to values (and optionally mark dependencies).
  - Expects `functionResolver(name,args)` to invoke built-in or user-defined functions with typed args.
  - Defines numeric/operator semantics and fallback behaviors (e.g., divide-by-zero handling) at the evaluator layer.
- Caching & Integration
  - Parser output is designed to be cached by `FormulaInfo`, which avoids reparsing on repeated recalculations.
  - AST is evaluator-agnostic: different resolvers enable headless unit testing and grid integration.
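The parsing strategy described above (precedence climbing for binary operators, recursive descent for primaries and function calls) can be sketched as follows. The precedence table, the field names on the nodes, and the one-line stand-in lexer are assumptions; string literals are left out to keep the sketch short.

```typescript
type ASTNode =
  | { kind: "Number"; value: number }
  | { kind: "Cell"; ref: string }
  | { kind: "Unary"; op: string; operand: ASTNode }
  | { kind: "Binary"; op: string; left: ASTNode; right: ASTNode }
  | { kind: "FunctionCall"; name: string; args: ASTNode[] };

// Assumed binding powers: higher binds tighter, all left-associative.
const PRECEDENCE: Record<string, number> = {
  "=": 1, "<>": 1, "<": 1, "<=": 1, ">": 1, ">=": 1,
  "+": 2, "-": 2,
  "*": 3, "/": 3,
};

function parse(input: string): ASTNode {
  // Stand-in lexer for this sketch only. Unlike a real tokenizer it
  // silently skips characters that it does not recognize.
  const tokens: string[] =
    input.toUpperCase().match(/\d+(?:\.\d+)?|[A-Z_][A-Z0-9_]*|<=|>=|<>|[-+*\/=<>(),]/g) ?? [];
  let i = 0;
  const peek = (): string | undefined => tokens[i];
  const expect = (text: string): void => {
    if (tokens[i] !== text) throw new Error(`Expected '${text}' at token ${i}`);
    i++;
  };

  // Precedence climbing: consume operators whose precedence is >= minPrec.
  function parseExpression(minPrec: number): ASTNode {
    let left = parseUnary();
    for (;;) {
      const op = peek();
      const prec = op === undefined ? undefined : PRECEDENCE[op];
      if (op === undefined || prec === undefined || prec < minPrec) return left;
      i++;
      const right = parseExpression(prec + 1); // +1 gives left associativity
      left = { kind: "Binary", op, left, right };
    }
  }

  function parseUnary(): ASTNode {
    const op = peek();
    if (op === "+" || op === "-") {
      i++;
      return { kind: "Unary", op, operand: parseUnary() };
    }
    return parsePrimary();
  }

  function parsePrimary(): ASTNode {
    const tok = tokens[i++];
    if (tok === undefined) throw new Error("Unexpected end of formula");
    if (tok === "(") {
      const inner = parseExpression(1);
      expect(")");
      return inner;
    }
    if (/^\d/.test(tok)) return { kind: "Number", value: Number(tok) };
    if (/^[A-Z]{1,3}\d+$/.test(tok)) return { kind: "Cell", ref: tok };
    if (/^[A-Z_]/.test(tok)) {
      // A bare identifier becomes a zero-argument FunctionCall.
      const args: ASTNode[] = [];
      if (peek() === "(") {
        i++;
        if (peek() !== ")") {
          args.push(parseExpression(1));
          while (peek() === ",") {
            i++;
            args.push(parseExpression(1));
          }
        }
        expect(")");
      }
      return { kind: "FunctionCall", name: tok, args };
    }
    throw new Error(`Unexpected token '${tok}'`);
  }

  const ast = parseExpression(1);
  if (i < tokens.length) throw new Error(`Unexpected token '${tokens[i]}'`);
  return ast;
}
```

With this table, `parse("1 + 2 * 3")` puts the `*` node under the right side of `+`, and `parse("8 - 3 - 2")` nests the first subtraction on the left, which is what left associativity requires.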

## Validation & Safety
- Syntax validation: mismatched parentheses, unterminated strings, and unexpected tokens are detected at parse time and should surface as structured parse errors.
- Identifier/Cell normalization: identifiers are upper-cased at lex time for predictable function lookups; cell reference patterns are validated with a regular expression.
- Runtime safety: the evaluator should guard against division by zero, numeric overflow, and invalid argument counts, either returning error tokens or throwing a `FormulaError` that the calculate engine consumes.
- Injection safety: the parser recognizes only a limited grammar and never executes arbitrary code; `functionResolver` is the only bridge to custom behavior.
- Determinism: parsing and evaluation avoid global state; resolvers must be pure, or clearly documented as side-effecting, so that repeated evaluations give the same results.
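As one way to picture the parse-time checks listed above, the sketch below reports delimiter problems as structured records and validates cell references with a regular expression. The `ParseIssue` shape, the error codes, the function names, and the exact cell pattern are hypothetical; the document only states that errors should be structured and that cell refs are regex-validated.

```typescript
// Hypothetical structured error record.
interface ParseIssue {
  code: "UNTERMINATED_STRING" | "UNMATCHED_CLOSE_PAREN" | "UNCLOSED_PAREN";
  pos: number; // offset in the formula text
  message: string;
}

// Scans a formula for unterminated strings and mismatched parentheses.
function checkDelimiters(formula: string): ParseIssue[] {
  const issues: ParseIssue[] = [];
  const openParens: number[] = []; // offsets of "(" still waiting for ")"
  let i = 0;
  while (i < formula.length) {
    const ch = formula.charAt(i);
    if (ch === '"') {
      // Skip to the closing quote; parentheses inside strings do not count.
      // A doubled quote ("") is seen as two adjacent strings, which leaves
      // the balance check unaffected.
      const close = formula.indexOf('"', i + 1);
      if (close === -1) {
        issues.push({ code: "UNTERMINATED_STRING", pos: i, message: `String opened at offset ${i} is never closed` });
        break; // everything after this point is inside the open string
      }
      i = close + 1;
      continue;
    }
    if (ch === "(") openParens.push(i);
    if (ch === ")" && openParens.pop() === undefined) {
      issues.push({ code: "UNMATCHED_CLOSE_PAREN", pos: i, message: `')' at offset ${i} has no matching '('` });
    }
    i++;
  }
  for (const pos of openParens) {
    issues.push({ code: "UNCLOSED_PAREN", pos, message: `'(' at offset ${pos} is never closed` });
  }
  return issues;
}

// Assumed A1-style pattern: one to three letters, then a row number
// that does not start with 0.
const CELL_REF = /^[A-Z]{1,3}[1-9][0-9]*$/;

// Returns the upper-cased reference, or null when the text is not a valid ref.
function normalizeCellRef(raw: string): string | null {
  const ref = raw.toUpperCase();
  return CELL_REF.test(ref) ? ref : null;
}
```

Returning a list of issues with offsets, instead of throwing on the first problem, is a design choice of this sketch; it lets a caller highlight every problem in the formula text at once.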

## Desired Outputs

User-Facing
- Faster recalculation when parsed ASTs are cached (snappier UI updates).
- Clear parse-time error messages for malformed formulas.

System-Level
- **AST Caching:** parsed `ASTNode` stored in `FormulaInfo` for reuse across refresh cycles.
- **Resolvers:** engine provides `cellResolver` and `functionResolver` to integrate with dependency tracking and function libraries.
- **Error Reporting:** structured parse/runtime errors are returned to the calculate layer, which emits `onFailure` events and attaches error codes to cell models.
- **No DOM:** parser is headless and emits no DOM classes; UI rendering is driven by calculate/grid layers consuming the evaluation results.
- **Undo/Redo:** parser itself is stateless and doesn't record history; callers should create undo records when formula text changes or parsed ASTs are invalidated.
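The AST caching described above might be wired up along these lines. `FormulaInfo` is named in this document but its fields are not, so `text`, `cachedAst`, `cachedFor`, and the `getAst` helper are all hypothetical.

```typescript
// Placeholder node type for this sketch; the real ASTNode is richer.
type ASTNode = { kind: string; [field: string]: unknown };

// Hypothetical FormulaInfo shape: holds the formula text and its cached AST.
interface FormulaInfo {
  text: string;
  cachedAst?: ASTNode;
  cachedFor?: string; // the formula text the cached AST was parsed from
}

// Returns the cached AST, reparsing only when the formula text has changed.
function getAst(info: FormulaInfo, parse: (input: string) => ASTNode): ASTNode {
  if (info.cachedAst !== undefined && info.cachedFor === info.text) {
    return info.cachedAst; // cache hit: no reparse on recalculation
  }
  const ast = parse(info.text);
  info.cachedAst = ast;
  info.cachedFor = info.text;
  return ast;
}

// Usage with a stub parser that counts how often it is called.
let parseCalls = 0;
const stubParse = (input: string): ASTNode => {
  parseCalls++;
  return { kind: "String", value: input };
};

const info: FormulaInfo = { text: "A1+1" };
getAst(info, stubParse); // parses
getAst(info, stubParse); // served from the cache
info.text = "A1+2";      // editing the formula invalidates the cached AST
getAst(info, stubParse); // parses again
// parseCalls is now 2
```

Comparing against the stored text keeps invalidation implicit: callers only change `text`, and the next `getAst` call notices the mismatch.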

## Implementation Notes
- Keep tokenization and parsing fast and allocation-light to support large recalculation workloads.
- Normalize identifiers and cell tokens at lex time to simplify function lookup and caching keys.
- Ensure AST nodes are shallow POJOs suitable for JSON serialization when needed for debugging or telemetry.
- Make evaluator resolvers small and testable: a mock `cellResolver` enables isolated unit tests for functions and operator semantics.
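To illustrate the last point, here is a minimal evaluator sketch exercised with mock resolvers and hand-built ASTs, so no lexer or parser is involved. The node field names, the `#DIV/0!` error token, and the numeric coercion via `Number()` are assumptions; error-token propagation through nested expressions is deliberately left out.

```typescript
type ASTNode =
  | { kind: "Number"; value: number }
  | { kind: "Cell"; ref: string }
  | { kind: "Unary"; op: "+" | "-"; operand: ASTNode }
  | { kind: "Binary"; op: "+" | "-" | "*" | "/"; left: ASTNode; right: ASTNode }
  | { kind: "FunctionCall"; name: string; args: ASTNode[] };

interface EvaluateOptions {
  cellResolver?: (ref: string) => unknown;
  functionResolver?: (name: string, args: unknown[]) => unknown;
}

const DIV_ZERO = "#DIV/0!"; // assumed error token for division by zero

// Recursive AST walk; all cell and function behavior comes from the resolvers.
function evaluate(ast: ASTNode, options: EvaluateOptions = {}): unknown {
  switch (ast.kind) {
    case "Number":
      return ast.value;
    case "Cell": {
      if (!options.cellResolver) throw new Error(`No cellResolver for ${ast.ref}`);
      return options.cellResolver(ast.ref);
    }
    case "Unary": {
      const v = Number(evaluate(ast.operand, options));
      return ast.op === "-" ? -v : v;
    }
    case "Binary": {
      const l = Number(evaluate(ast.left, options));
      const r = Number(evaluate(ast.right, options));
      if (ast.op === "+") return l + r;
      if (ast.op === "-") return l - r;
      if (ast.op === "*") return l * r;
      return r === 0 ? DIV_ZERO : l / r; // "/" is the only operator left
    }
    case "FunctionCall": {
      if (!options.functionResolver) throw new Error(`No functionResolver for ${ast.name}`);
      const args = ast.args.map((arg) => evaluate(arg, options));
      return options.functionResolver(ast.name, args);
    }
  }
}

// Mock resolvers: a fixed cell table that also records which refs were read.
const mockCells: Record<string, number> = { A1: 10, B1: 0 };
const refsRead: string[] = [];
const mockCellResolver = (ref: string): unknown => {
  refsRead.push(ref);
  return mockCells[ref];
};
const mockFunctionResolver = (name: string, args: unknown[]): unknown => {
  if (name === "SUM") return args.reduce((acc: number, a) => acc + Number(a), 0);
  throw new Error(`Unknown function ${name}`);
};

// Hand-built AST for A1 / B1.
const quotient: ASTNode = {
  kind: "Binary",
  op: "/",
  left: { kind: "Cell", ref: "A1" },
  right: { kind: "Cell", ref: "B1" },
};
const quotientResult = evaluate(quotient, { cellResolver: mockCellResolver });
// quotientResult is "#DIV/0!" and refsRead is ["A1", "B1"]
```

Recording the refs that were read is a cheap way for a test to assert on dependency tracking without a real grid.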