# xml-xsd-engine — Planned Roadmap

> **Current version: v1.7.0** (branch `v1_7_0` — post-release stabilisation complete, 2460 tests)  
> For completed features (v1.0–v1.7) see [ROADMAP-COMPLETED.md](./ROADMAP-COMPLETED.md)

---

## v1.7 — Streaming Validation Foundation ✅ COMPLETED

All v1.7 items were implemented on branch `v1_7_0` as part of the stability / readiness work.  
They will ship in the next tagged release alongside v1.7.0 improvements.

| Item | Ref | Priority | Status |
|------|-----|----------|--------|
| DFA-based content-model validation | G28 | P0 | ✅ Done |
| Streaming validation (SAX to validator, no DOM required) | G1 | P0 | ✅ Done |
| Full DFA construction per compositor in schema compiler | G27 | P0 | ✅ Done |
| Incremental / partial validation (subtree revalidation) | G38 | P1 | ✅ Done |
| xs:keyref streaming-aware referential check | G28b | P1 | ✅ Done |
| Streaming validation result (async generator) | — | P2 | ✅ Done |

See [ROADMAP-COMPLETED.md](./ROADMAP-COMPLETED.md) for full details.
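
As a rough illustration of what DFA-based content-model validation means, the sketch below hand-builds a transition table for the content model `(title, author+, year?)` and feeds child element names through it one at a time, which is what makes the approach a natural fit for SAX-driven streaming. The names `Dfa`, `bookDfa`, `step` and `accepts` are hypothetical and chosen for this example only; they are not the engine's actual API or data structures.

```typescript
// Hypothetical sketch: not the engine's real data structures.
// Content model: (title, author+, year?)
interface Dfa {
  start: number;
  accepting: number[];
  // transitions[state][elementName] -> next state
  transitions: Array<{ [name: string]: number }>;
}

const bookDfa: Dfa = {
  start: 0,
  accepting: [2, 3],
  transitions: [
    { title: 1 },           // state 0: expect <title>
    { author: 2 },          // state 1: expect the first <author>
    { author: 2, year: 3 }, // state 2: more <author> or the optional <year>
    {},                     // state 3: nothing further is allowed
  ],
};

// Advance one state; returns -1 when the child is not allowed here.
function step(dfa: Dfa, state: number, child: string): number {
  const row = dfa.transitions[state];
  return Object.prototype.hasOwnProperty.call(row, child) ? row[child] : -1;
}

// Feed children one at a time, as a streaming validator would on each start tag.
function accepts(dfa: Dfa, children: string[]): boolean {
  let state = dfa.start;
  for (let i = 0; i < children.length; i++) {
    state = step(dfa, state, children[i]);
    if (state === -1) return false;
  }
  return dfa.accepting.indexOf(state) !== -1;
}
```

Because only the current state is kept per open element, memory use depends on nesting depth rather than document size.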

---

## v1.7 Post-Release — Known Issues to Fix

> Picked up from [ISSUES.md](./ISSUES.md). These were to be fixed before tagging v1.7.0; every item below is now marked fixed.

| Item | Issue Ref | Priority | Status |
|------|-----------|----------|--------|
| `StreamingKeyrefTracker` full XPath selector evaluation | T-01 | P0 | ✅ Fixed |
| `xsi:type` streaming override: qualified name + simple type fallback | T-02 | P0 | ✅ Fixed |
| `CLI --streaming` flag wired to actual streaming path | A-07 | P1 | ✅ Fixed |
| `xs:all` maxOccurs=1 enforcement in streaming DFA | T-03 | P1 | ✅ Fixed |
| `TypeValidator._wsCache` clear on schema reload | T-04 | P1 | ✅ Fixed |
| `revalidateSubtree` PSVI propagation | T-05 | P1 | ✅ Fixed |
| `elementFormDefault="qualified"` matching | T-06 | P1 | ✅ Fixed |
| `xs:assert` child text / multi-level path support | T-08 | P2 | ✅ Fixed |
| `xs:group ref` occurrence constraints in DFA compilation | T-09 | P2 | ✅ Fixed |
| `XmlDiff` fingerprint cache per-call | P-01 | P2 | ✅ Fixed |
| `SchemaCompilerLite` diamond inheritance memoization | P-04 | P2 | ✅ Fixed |
| `StreamingKeyrefTracker.validate()` tuple count limit (DoS) | S-03 | P2 | ✅ Fixed |
| `xsi:type` type-confusion security review | S-01 | P1 | ✅ Fixed |

---

## v1.9 — XPath 2.0, Schema Completion, Compliance

| Item | Ref | Priority |
|------|-----|----------|
| XPath 2.0 execution model formalization (lazy iterator) | G33/L0 | P0 |
| XPath 2.0 remaining functions (date/time, node-construction, range) | G6 | P1 |
| Formal XsdType interface (derive, normalize, validate) | G32 | P1 |
| PSVI completeness for all 56 built-in types | G29 | P1 |
| Canonical XML 1.1 support | — | P2 |
| XPath predicate optimization (index-backed attribute lookup) | — | P2 |
| `xs:union` memberTypeDefs serialization (T-07) | — | P2 |
| W3C XSD test suite runner (conformance tracking) | — | P2 |

---

## v2.0 — Full XSD 1.0 Compliance + Streaming-First

| Item | Ref | Priority |
|------|-----|----------|
| Streaming-first architecture (DOM fully optional) | G1 | P0 |
| Substitution group full implementation | — | P1 |
| Abstract type enforcement | — | P1 |
| xsi:type runtime type substitution (complete) | — | P1 |
| Namespace-qualified schema element matching (T-06) | G35 | P1 |
| elementFormDefault / attributeFormDefault rules (complete) | — | P2 |
| `ValidationEngine` / `StreamingValidator` unified interface (A-01) | — | P2 |
| `BatchValidator` streaming/throttled file reading (A-04) | — | P2 |
| xs:notation | — | P3 |
| Worker thread batch validation (FP-05) | — | P2 |
| Zero-copy lexer (FP-01) | — | P1 |
| DFA minimization via Hopcroft's algorithm (FP-04) | — | P2 |
| Serialized schema cache to disk (FP-07) | — | P2 |
| `parseXsdAsync` / `AsyncSchemaLoader` in browser entry (A-05) | — | P2 |

---

## v2.1 — Advanced Features

| Item | Ref | Priority |
|------|-----|----------|
| Schematron (ISO/IEC 19757-3) validation | G7 | P1 |
| XSD 1.1 xs:assert with XPath 2.0 | G24 | P1 |
| XSD 1.1 conditional type assignment | — | P2 |
| XML Digital Signatures (xmldsig) via C14N and SHA-256 | — | P2 |
| Schema-aware XML diff | G15 | P2 |
| VS Code extension via LSP | — | P3 |

---

## Developer Experience Backlog

| Feature | Description | Priority | Effort |
|---------|-------------|----------|--------|
| Schema registry | Named schema repository: `registry.get("invoice-v2")` | P2 | Medium |
| Auto-complete hints | Return valid child element list for a given path from schema | P2 | Medium |
| XSD documentation generator | Extract `xs:annotation/xs:documentation` to HTML/Markdown | P3 | Low |
| Interactive schema explorer | Browser-based type hierarchy visualization | P3 | High |
| Jest matcher | `expect(xml).toBeValidAgainst(schema)` | P2 | Low |
| ESLint plugin | Lint XML-generating code against schema | P3 | Medium |
| GitHub Action | Ready-made action for `xml-validate` in CI | P2 | Low |
| Web playground | Browser-based REPL for XML/XSD validation | P3 | Medium |
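
To make the Jest matcher item more concrete, here is a minimal sketch of what `toBeValidAgainst` could look like. It relies only on the documented Jest matcher contract (a function returning `{ pass, message }`). The `Validate` type and the `fakeValidate` stand-in are assumptions made for the example, since this roadmap does not show the engine's validation entry point.

```typescript
// Hypothetical: stands in for the engine's real validation call, which is not shown here.
type Validate = (xml: string, schema: string) => string[]; // returns error messages

interface MatcherResult {
  pass: boolean;
  message: () => string;
}

// Build the matcher around an injected validator so the sketch stays engine-agnostic.
function makeToBeValidAgainst(validate: Validate) {
  return function toBeValidAgainst(received: string, schema: string): MatcherResult {
    const errors = validate(received, schema);
    const pass = errors.length === 0;
    return {
      pass: pass,
      message: function () {
        return pass
          ? "expected XML not to be valid against the schema"
          : "expected XML to be valid against the schema, but got:\n" + errors.join("\n");
      },
    };
  };
}

// Demonstration-only validator: accepts documents that start with an <invoice> element.
const fakeValidate: Validate = function (xml) {
  return xml.indexOf("<invoice") === 0 ? [] : ["root element must be <invoice>"];
};

const toBeValidAgainst = makeToBeValidAgainst(fakeValidate);
```

In a Jest setup file the matcher would be registered with `expect.extend({ toBeValidAgainst })`, after which `expect(xml).toBeValidAgainst(schema)` reads as in the table above.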

---

## Performance Backlog

> See full list in [PERFORMANCE.md — Future Optimizations](./PERFORMANCE.md#future-performance-optimizations-backlog).

| # | Item | Priority |
|---|------|----------|
| FP-01 | Zero-copy lexer (start+length spans) | P1 |
| FP-02 | Adaptive XPath LRU tuning | P2 |
| FP-03 | Merged type-check + DFA transition | P2 |
| FP-04 | DFA minimization (Hopcroft) | P2 |
| FP-05 | Worker thread batch validation | P2 |
| FP-07 | Serialized schema cache to disk | P2 |
| FP-11 | StreamingKeyrefTracker full XPath selector | P0 |
| FP-15 | Diff-based incremental revalidation | P2 |
| FP-17 | Parallel xs:import loading | P3 |
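
The "start+length spans" idea behind FP-01 can be sketched as follows: tokens record offsets into the source string instead of holding sliced substrings, and text is materialised only on demand. This is an illustration under simplifying assumptions (ASCII names only; comments, CDATA and attribute values are not skipped), and `Span`, `scanStartTagNames`, `spanEquals` and `spanText` are hypothetical names, not the engine's lexer.

```typescript
// Hypothetical sketch of span-based tokens; not the engine's actual lexer.
interface Span {
  start: number;  // offset into the source string
  length: number; // length in UTF-16 code units
}

function isNameStart(c: number): boolean {
  return (c >= 65 && c <= 90) || (c >= 97 && c <= 122) || c === 95; // A-Z, a-z, _
}

function isNameChar(c: number): boolean {
  // name-start characters plus 0-9, '-', '.', ':'
  return isNameStart(c) || (c >= 48 && c <= 57) || c === 45 || c === 46 || c === 58;
}

// Record the name of every start tag as a span; no substrings are created.
function scanStartTagNames(source: string): Span[] {
  const spans: Span[] = [];
  let i = 0;
  while (i < source.length) {
    // 60 is '<'; charCodeAt past the end yields NaN, which fails isNameStart.
    if (source.charCodeAt(i) === 60 && isNameStart(source.charCodeAt(i + 1))) {
      const start = i + 1;
      let end = start;
      while (end < source.length && isNameChar(source.charCodeAt(end))) end++;
      spans.push({ start: start, length: end - start });
      i = end;
    } else {
      i++;
    }
  }
  return spans;
}

// Compare a span against a known name without allocating a substring.
function spanEquals(source: string, span: Span, name: string): boolean {
  if (span.length !== name.length) return false;
  for (let k = 0; k < name.length; k++) {
    if (source.charCodeAt(span.start + k) !== name.charCodeAt(k)) return false;
  }
  return true;
}

// Materialise the text only when a caller needs it, e.g. for an error message.
function spanText(source: string, span: Span): string {
  return source.substring(span.start, span.start + span.length);
}
```

The hot path (matching element names against schema declarations) uses `spanEquals`, so string allocation happens only on the slower reporting path.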

---

## Standards Compliance Targets

| Standard | Current (v1.7.0) | Target (v2.0) |
|----------|----------------------|---------------|
| XSD 1.0 | ~75% | ~95% |
| XSD 1.1 | ~15% | ~40% |
| XPath 1.0 | ~85% | ~95% |
| XPath 2.0 functions | 22/~100 | 60/~100 |
| W3C C14N 1.0 | ~90% | ~100% |
| XML Namespaces 1.1 | ~80% | ~95% |


---

## Good-To-Have Features

### Performance

| Feature | Description | Effort |
|---------|-------------|--------|
| Worker thread batch validation | Parallel validation using worker_threads | Medium |
| WASM build | Near-native speed in browsers and serverless | High |
| Schema compilation caching to disk | Persist compiled schemas between Node.js runs | Medium |
| Adaptive LRU tuning | Auto-adjust XPath cache sizes based on hit rate | Low |
| Zero-copy lexer | Avoid string slicing for token values | High |

### Formats and Protocols

| Feature | Description | Effort |
|---------|-------------|--------|
| XML Schema to OpenAPI 3.x | Generate OpenAPI schemas from XSD | High |
| RELAX NG support | Alternative schema language parser and validator | High |
| DTD full validation | Validate against DTD (currently only entity expansion) | Medium |
| XML catalogs (OASIS) | Resolve xs:import via XML catalog files | Medium |
| GZipped schema support | Parse .xsd.gz directly | Low |

### Compliance

| Feature | Description | Effort |
|---------|-------------|--------|
| W3C XSD test suite runner | Automated conformance testing | Medium |
| Compliance dashboard | Track XSD 1.0 / XSD 1.1 / XPath 2.0 percentage | Low |
| libxml2 differential testing | Cross-validate against libxml2 for correctness | Medium |
| XSD 1.1 full support | Open content, version control, assertions | Very High |

---

## Architecture Goals (v2.x)

| Goal | Description                                                                   |
|------|-------------------------------------------------------------------------------|
| DOM optional | Streaming-first — DOM built on demand only                                    |
| DFA core | Content-model validation via pre-compiled DFA per complex type ✅ Done in v1.7 |
| PSVI completeness | Full PSVI for all node types                                                  |
| Zero-copy parsing | Avoid unnecessary string allocation in lexer                                  |
| Formal compliance | Track W3C XSD test suite pass rate                                            |
| Independent modules | Parser / Schema / XPath publishable as separate packages                      |

