GroundRecall/docs/architecture.md

94 lines
2.3 KiB
Markdown

# Architecture
`GroundRecall` is the grounded knowledge substrate in a larger stack:
- `GroundRecall`: canonical knowledge ingestion, promotion, query, export, and future sync
- `Didactopus`: learner-facing workflows and educational tooling
- `GenieHive`: model and routing layer where runtime assistant/service resolution is needed
## Core Design
The system is built around one canonical flow:
1. ingest weakly structured sources
2. normalize them into stable knowledge objects
3. lint and queue them for review
4. promote reviewed objects into a canonical store
5. query and export promoted state
## Core Objects
The canonical store is built from these object families:
- `Source`
- `Fragment`
- `Artifact`
- `Observation`
- `Claim`
- `Concept`
- `Relation`
- `ReviewCandidate`
- `PromotionRecord`
- `GroundRecallSnapshot`
These objects are assistant-neutral. Assistant-specific formatting belongs at the adapter layer.
## Package Surface
The main standalone package surface is:
- `groundrecall.ingest`
- `groundrecall.lint`
- `groundrecall.models`
- `groundrecall.store`
- `groundrecall.promotion`
- `groundrecall.query`
- `groundrecall.export`
- `groundrecall.assistant_export`
- `groundrecall.inspect`
- `groundrecall.source_adapters.*`
- `groundrecall.assistants.*`
There are also compatibility-style helper modules prefixed with `groundrecall_` inside the package. Those exist because the standalone repo was extracted from an earlier monorepo layout.
## Source Adapters
Adapters handle source-shape-specific discovery and mapping while the downstream pipeline stays generic.
Current adapter families include:
- `llmwiki`
- `markdown_notes`
- `transcript`
- `didactopus_pack`
## Assistant Boundary
Assistant integration is intentionally outside the core store and query semantics.
The rule is:
- core `GroundRecall` owns truth, provenance, lifecycle, and retrieval semantics
- assistant adapters own presentation, bundle shaping, and tool-specific exports
Current adapters include:
- `codex`
- `claude_code`
## Alpha Boundary
The current alpha is strong enough for:
- local import and promotion
- canonical query and export
- assistant-neutral bundles
- assistant-targeted bundle generation
It is not yet complete for:
- multi-node sync and merge
- re-import/update semantics
- richer review adjudication
- large-scale distributed corpus integration