/**
 * Entity tagger: a regex-based extractor for proper nouns and structured tokens.
 *
 * Used by the `hybridSearch` controller as a third reciprocal rank fusion
 * (RRF) arm alongside the dense (vector) and sparse (BM25/FTS5) signals
 * (ADR-147, ruvnet/ruflo#2317).
 *
 * Entity matching is a distinct signal because BM25 weights terms by their
 * overall frequency, so a token that names an entity but happens to be
 * common is downweighted. A per-entity exact match avoids that penalty.
 * Example: for the query "Alice OAuth tokens", BM25 may rank a doc about
 * generic OAuth above one that mentions Alice specifically; the entity arm
 * surfaces the Alice doc independently.
 *
 * P1 is regex-only, with no NLP model dependency. It tags:
 *   - Emails:               foo@bar.com
 *   - URLs:                 http(s)://...
 *   - File paths:           ./foo/bar.ts, C:\foo\bar.ts, src/foo
 *   - Quoted phrases:       "..." or '...'
 *   - Proper-noun 2-grams:  "Alice Smith", "Acme Corp"
 *
 * The tagger is deliberately conservative. False negatives are acceptable
 * because the dense and sparse arms still cover the query; false positives
 * would dilute the RRF score by adding noise rows. A future P2 can swap this
 * for a CRF/spaCy tagger behind the same `extractEntities(text)` contract.
 */
/**
 * Extract entity-like tokens from free text.
 *
 * Returns the trimmed, deduplicated entities in extraction order.
 */
export declare function extractEntities(text: string): string[];
//# sourceMappingURL=entity-tagger.d.ts.map