AI RAG Source Map
Scope
This document maps the real knowledge and retrieval surfaces already present in the repository that could support an AI assistant or RAG workflow.
It distinguishes between:
Category: 10_normative | Version: v1.0.0
Owner: DOCUMENT_CUSTODIAN | Review cycle: 90 days
Approval authority: GOVERNANCE_ADMIN
The Kvary platform was originally created in Georgian. Where a Georgian version exists, Georgian is authoritative for platform UI, documentation, and legal interpretation.
Translations into other languages are provided for convenience. Some records may originate in other languages and carry their own source or legal locale for a specific flow, but where a Georgian version is available, the Georgian version prevails for platform-level wording and interpretation.
| Source | Location | Verified | State | Queryability | Surface Type | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| Managed docs corpus | docs/** via apps/web/src/lib/docs/documentRegistry.ts and apps/web/lib/docsRagBootstrap.ts | VERIFIED | IMPLEMENTED | already queryable | KNOWLEDGE SURFACE | Current assistant retrieval already uses this source. |
| Document metadata registry | apps/web/src/lib/docs/documentRegistry.ts | VERIFIED | IMPLEMENTED | already queryable | KNOWLEDGE SURFACE | Strong metadata for category, visibility, audience, status, tags. |
| Governance/document manifests | docs/_manifest/documents.manifest.json, docs/_manifest/governance.manifest.json | VERIFIED | IMPLEMENTED | partially queryable | KNOWLEDGE SURFACE / DOC-ONLY | Structured metadata exists and is RAG-friendly, but not all manifest structure is yet used by the assistant path. |
| AI governance policy corpus | docs/10_normative/KVARY_AI_* | VERIFIED | IMPLEMENTED | already queryable | KNOWLEDGE SURFACE / DOC-ONLY | Strong source for guardrails, memory, audit, and policy-grounded answers. |
| Architecture and procedure docs | docs/CURRENT_ARCHITECTURE.md, docs/EVENT_DRIVEN_ARCHITECTURE.md, governance procedures | VERIFIED | IMPLEMENTED | already queryable | KNOWLEDGE SURFACE / DOC-ONLY | Already inside the docs knowledge surface if registered/read by the docs bootstrap. |
| Butkhuzi norms rows | services/svc-butkhuzi/src/butkhuzi/repository.ts and API/gateway routes | VERIFIED | IMPLEMENTED | already queryable | KNOWLEDGE SURFACE | Structured norms list/browse surface. |
| Butkhuzi suggest | GET /butkhuzi/suggest | VERIFIED | IMPLEMENTED | already queryable | KNOWLEDGE SURFACE | Useful for assistant autocomplete and intent narrowing. |
| Butkhuzi chunk search | GET /butkhuzi/search | VERIFIED | IMPLEMENTED | already queryable | KNOWLEDGE SURFACE | Strongest domain-semantic retrieval surface in the repo. |
| Butkhuzi ingestion/chunk rebuild | POST /butkhuzi/upsert, POST /butkhuzi/chunks/rebuild | VERIFIED | IMPLEMENTED | operationally queryable | INFRA SURFACE / KNOWLEDGE SURFACE | Real corpus maintenance path for assistant-quality norms retrieval. |
| Docs access/acknowledgement history | apps/web/src/lib/docs/documentAccessLog.ts | VERIFIED | IMPLEMENTED | partially queryable | KNOWLEDGE SURFACE / INFRA SURFACE | Could support assistant memory or compliance awareness later. |
| AI audit history | apps/web/var/log/ai-audit.jsonl and AI audit routes | VERIFIED | IMPLEMENTED | partially queryable | KNOWLEDGE SURFACE / INFRA SURFACE | Good for traceability and policy review, not yet broad answer grounding. |
| Founder chat memory | packages/memory-layer/* | VERIFIED | IMPLEMENTED | partially queryable | INFRA SURFACE / KNOWLEDGE SURFACE | Recent conversation recall exists, but not advanced semantic memory retrieval. |
| Evidence attachments / declaration evidence | services/svc-tenders/src/evidenceStorage.ts and evidence APIs | VERIFIED | IMPLEMENTED | partially queryable | KNOWLEDGE SURFACE | Operational evidence exists, but no assistant-oriented indexing or retrieval layer was verified. |
| Structured service APIs | services/* and services/api/* | VERIFIED | PARTIAL | partially queryable | KNOWLEDGE SURFACE | Rich operational data exists, but not normalized into assistant context retrieval. |
| KES traceability surface | apps/web/src/features/kesTrace/* | VERIFIED | PARTIAL | partially queryable | UI-ONLY / KNOWLEDGE SURFACE | Product surface exists, but current trace data is not a verified backend knowledge substrate. |
| Event catalog docs | docs/80_chain/EVENT_CATALOG.md | VERIFIED | IMPLEMENTED | already queryable through docs | KNOWLEDGE SURFACE / DOC-ONLY | Useful as documented event knowledge, not as live event-history retrieval. |
| Kafka/outbox/event history | services/*/src/kafka/* | VERIFIED | PARTIAL | not yet queryable for assistant use | INFRA SURFACE / KNOWLEDGE SURFACE | Real backbone exists, but no assistant-facing retrieval abstraction was verified. |
| Persistent vector knowledge store | repo-wide | VERIFIED | MISSING | not yet queryable | INFRA SURFACE | No generalized persisted vector DB for assistant knowledge was verified. |
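The table above can also be held as typed data so that tooling can filter it. The following is a minimal sketch, not code from the repository: the `SourceEntry` type, the shortened queryability values, and the `assistantReady` helper are hypothetical, and only a few rows are copied from the table.

```typescript
// Hypothetical types mirroring some of the table columns; not taken from the repository.
type State = "IMPLEMENTED" | "PARTIAL" | "MISSING";
type Queryability = "already" | "partially" | "operationally" | "not yet";

interface SourceEntry {
  source: string;
  location: string;
  state: State;
  queryability: Queryability;
}

// A few rows copied from the table above.
const sourceMap: SourceEntry[] = [
  { source: "Managed docs corpus", location: "docs/**", state: "IMPLEMENTED", queryability: "already" },
  { source: "Butkhuzi chunk search", location: "GET /butkhuzi/search", state: "IMPLEMENTED", queryability: "already" },
  { source: "Founder chat memory", location: "packages/memory-layer/*", state: "IMPLEMENTED", queryability: "partially" },
  { source: "Kafka/outbox/event history", location: "services/*/src/kafka/*", state: "PARTIAL", queryability: "not yet" },
  { source: "Persistent vector knowledge store", location: "repo-wide", state: "MISSING", queryability: "not yet" },
];

// Sources that are both implemented and already queryable.
function assistantReady(entries: SourceEntry[]): string[] {
  return entries
    .filter((e) => e.state === "IMPLEMENTED" && e.queryability === "already")
    .map((e) => e.source);
}

console.log(assistantReady(sourceMap));
```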
Current path:
- docs/**
- documentRegistry.ts
- readDocumentMarkdown(...)
- docsRagBootstrap.ts
- InMemoryRagVectorStore
- retrieveRagMatches(...)
- /api/ai/ask

Status:
Limits:
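
To illustrate the docs retrieval path listed in this section (documents read, indexed into an in-memory vector store, then matched against a question), here is a minimal sketch of in-memory similarity retrieval. It is not the repository's `InMemoryRagVectorStore` or `retrieveRagMatches(...)`, whose signatures are not shown here. `TinyVectorStore`, `embed`, and `cosine` are hypothetical names, the term-frequency "embedding" is a toy stand-in for a real embedding model, and the chunk texts are placeholders.

```typescript
// Toy term-frequency vector; a real pipeline would call an embedding model instead.
type TermVector = Record<string, number>;

function embed(text: string): TermVector {
  const vec: TermVector = Object.create(null);
  for (const word of text.toLowerCase().split(/\W+/)) {
    if (word) vec[word] = (vec[word] || 0) + 1;
  }
  return vec;
}

function norm(v: TermVector): number {
  let sum = 0;
  for (const term of Object.keys(v)) sum += v[term] * v[term];
  return Math.sqrt(sum);
}

// Cosine similarity between two term vectors; 0 when either vector is empty.
function cosine(a: TermVector, b: TermVector): number {
  let dot = 0;
  for (const term of Object.keys(a)) dot += a[term] * (b[term] || 0);
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

interface Chunk { id: string; text: string; }
interface Match { id: string; score: number; }

// Hypothetical store: everything lives in process memory and is lost on restart.
class TinyVectorStore {
  private entries: { id: string; vector: TermVector }[] = [];

  add(chunk: Chunk): void {
    this.entries.push({ id: chunk.id, vector: embed(chunk.text) });
  }

  // Returns up to topK chunks with a non-zero similarity, best first.
  search(query: string, topK: number): Match[] {
    const q = embed(query);
    return this.entries
      .map((e) => ({ id: e.id, score: cosine(q, e.vector) }))
      .filter((m) => m.score > 0)
      .sort((x, y) => y.score - x.score)
      .slice(0, topK);
  }
}

const store = new TinyVectorStore();
store.add({ id: "doc-a", text: "event driven architecture overview" }); // placeholder text
store.add({ id: "doc-b", text: "ai audit policy and guardrails" });     // placeholder text
console.log(store.search("audit policy", 1));
```

Because a store like this lives only in process memory, the index has to be rebuilt on every start. That is consistent with the table's finding that no persisted vector store was verified.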
Current path:
- svc-butkhuzi routes

Status:
Strength:
Current path:
- /api/ai/ask
- createSession(...)
- appendMessage(...)
- getRecentMessages(...)

Status:
Limits:
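
The session memory path listed in this section can be pictured with the sketch below. The function names mirror the ones listed, but the signatures, the `Message` shape, and the in-memory storage are assumptions made for illustration and are not the actual `packages/memory-layer` API.

```typescript
type Role = "user" | "assistant";
interface Message { role: Role; content: string; }

// Hypothetical in-memory storage; the real memory layer may persist elsewhere.
const sessions: Record<string, Message[]> = Object.create(null);
let nextSessionNumber = 1;

function createSession(): string {
  const id = "session-" + nextSessionNumber++;
  sessions[id] = [];
  return id;
}

function appendMessage(sessionId: string, message: Message): void {
  const log = sessions[sessionId];
  if (!log) throw new Error("Unknown session: " + sessionId);
  log.push(message);
}

// Recency-only recall: returns the last `limit` messages, oldest first.
// There is no semantic lookup here, only a sliding window over the log.
function getRecentMessages(sessionId: string, limit: number): Message[] {
  const log = sessions[sessionId] || [];
  return limit <= 0 ? [] : log.slice(-limit);
}

const demoSession = createSession();
appendMessage(demoSession, { role: "user", content: "What is Butkhuzi?" });
appendMessage(demoSession, { role: "assistant", content: "A norms corpus." });
appendMessage(demoSession, { role: "user", content: "How do I search it?" });
console.log(getRecentMessages(demoSession, 2));
```

A recency window of this kind matches the table's note on founder chat memory: recent conversation recall, without advanced semantic memory retrieval.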
Current path:
Status:
Usefulness:
Why:
Why:
Why:
Why:
Why:
Why:
Why:
Why:
Why:
The current RAG foundation is best described as:
This means the platform is closer to:
than to:
Normalize the first multi-source assistant retrieval layer around:
Then make /api/ai/ask source-aware across both.
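
One way to picture a source-aware answer path is to tag every match with the source it came from and merge the per-source results before the prompt is built. The sketch below assumes the two sources are the docs corpus and Butkhuzi search. `SourceId`, `SourcedMatch`, and `mergeMatches` are hypothetical names, and the refs and scores are made-up sample values.

```typescript
// Assumed source identifiers; not confirmed by the repository.
type SourceId = "docs" | "butkhuzi";

interface SourcedMatch {
  source: SourceId; // which retrieval surface produced the match
  ref: string;      // document path or norm identifier
  score: number;    // relevance score as reported by that surface
}

// Flattens the per-source result lists and keeps the `limit` best matches.
// Caveat: scores from different retrievers are not necessarily comparable,
// so a real implementation would likely normalize them per source first.
function mergeMatches(perSource: SourcedMatch[][], limit: number): SourcedMatch[] {
  const all: SourcedMatch[] = [];
  for (const list of perSource) {
    for (const match of list) all.push(match);
  }
  return all.sort((a, b) => b.score - a.score).slice(0, limit);
}

// Made-up sample results standing in for the two retrieval calls.
const docsMatches: SourcedMatch[] = [
  { source: "docs", ref: "doc-a", score: 0.62 },
  { source: "docs", ref: "doc-b", score: 0.4 },
];
const butkhuziMatches: SourcedMatch[] = [
  { source: "butkhuzi", ref: "norm-1", score: 0.81 },
];

console.log(mergeMatches([docsMatches, butkhuziMatches], 2));
```

Keeping the `source` field on every match lets the answer cite where each passage came from, which is the main point of making the endpoint source-aware.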
Minimum credible direction:
The repo already has real RAG ingredients, but the usable assistant-ready sources are concentrated in:
Everything else is either partial, infrastructural, or only conceptually ready.