# Foundation Gateway Roadmap Last updated: 2026-04-29 ## Decision Do not fork GenieHive for the Foundation AI gateway work. Implement the feature set as an optional hardening profile on top of the existing local-first control plane. The core project should continue to support casual deployment: - local model services remain first-class - static `client_api_keys` and `node_api_keys` remain supported - empty key lists can still disable auth for development - audit logging, named keys, quotas, provider accounts, and admin endpoints are opt-in Foundation deployments should enable stricter controls through config, role catalogs, and operator documentation. ## Design Principle Separate mechanism from policy. Core GenieHive mechanisms: - authenticate a client and attach a request identity - route OpenAI-compatible requests through roles and services - optionally record audit metadata without prompt or completion content - optionally enforce model and operation scopes - optionally route to external provider-backed services - optionally summarize usage and enforce budgets Foundation policy: - who may receive a key - what models and roles are approved - what budgets apply - what provider accounts are used - how requests are reviewed before public publication - how emergency disable and key rotation are performed ## Compatibility Contract Every Foundation hardening change must preserve these behaviors unless a config explicitly opts into stricter operation: 1. Existing `configs/control.example.yaml` continues to load. 2. Existing static `auth.client_api_keys` continues to authorize requests. 3. Existing node registration keys continue to work. 4. Existing role catalogs continue to route without client allowlists. 5. `GET /v1/models`, chat, embeddings, transcription, and cluster inspection remain available in casual deployments. 6. No provider credentials are required for local-only deployment. 7. Admin endpoints are disabled unless admin authentication is configured. ## Profiles ### Casual Profile The casual profile is the default shape of GenieHive. Expected traits: - local or LAN-bound control plane - static shared client key, or no auth during isolated development - no audit log by default - no budget enforcement - no provider credential store - no admin API exposed by default ### Foundation Gateway Profile The Foundation gateway profile is an opt-in deployment mode for managed access to local and paid AI services. Expected traits: - named, revocable client credentials - request audit log without prompt or completion content - model and operation allowlists per key - Foundation-owned provider account indirection - optional budget and quota enforcement - migration-specific role catalogs - operator and board-readable governance documentation ## Configuration Shape The final config shape may evolve, but the intended compatibility model is: ```yaml deployment_profile: "casual" auth: client_api_keys: - "change-me-client-key" node_api_keys: - "change-me-node-key" enable_named_client_keys: false key_hash_secret_env: "GENIEHIVE_KEY_HASH_SECRET" audit: enabled: false admin_api: enabled: false authorization: enforce_model_allowlists: false enforce_operation_allowlists: false empty_allowlist_means_no_access: true providers: [] budgeting: enabled: false ``` Foundation example configs can switch these flags on. Casual example configs should stay short and understandable. ## Revised Milestones ### M0: Baseline and Compatibility Guard Goal: record the current behavior and make compatibility explicit before adding governance features. Tasks: - Add `docs/foundation_gateway_baseline.md`. - Record current commit, test command, existing exposed ports, and supported casual deployment behavior. - Add or preserve tests proving `configs/control.example.yaml` still loads and static `X-Api-Key` auth still works. Acceptance: - Baseline document exists. - Current test suite passes or failures are documented. - Compatibility contract is visible in docs. ### M1: Config Profiles and Feature Flags Goal: introduce opt-in switches without changing runtime behavior. Tasks: - Add config models for `deployment_profile`, `audit`, `admin_api`, `authorization`, `providers`, and `budgeting`. - Keep default values equivalent to current casual behavior. - Add a Foundation example config skeleton. - Add tests for default values and legacy config loading. Acceptance: - Existing configs load unchanged. - New config sections are accepted. - No governance feature activates by default. ### M2: Named Client Credentials Goal: support named, revocable API keys while keeping static keys working. Tasks: - Add `ClientContext` with principal metadata. - Add API key generation, hashing, verification, and redaction helpers. - Add a `client_keys` SQLite table. - Add registry methods to create, list, disable, enable, and touch keys. - Support named keys only when `auth.enable_named_client_keys` is true. - Preserve static `auth.client_api_keys`. Acceptance: - Static keys still work. - Named keys work through `X-Api-Key` when enabled. - Disabled named keys fail. - Raw keys are never stored. - Request handlers can read authenticated client context. ### M3: Request Audit Log Goal: make production requests attributable without storing prompt or completion content. Status: implemented for chat, embeddings, and transcription request wrappers. Audit logging is disabled by default and enabled by `audit.enabled`. Admin audit read endpoints are only mounted when `admin_api.enabled` is true. Tasks: - Add request ID generation from `X-Request-Id` or UUID. - Add `request_audit_log` SQLite table. - Record identity, operation, requested model, resolved service, upstream model, provider kind, status, duration, token usage when available, estimated cost when available, and error category. - Add admin-only query and summary endpoints, disabled unless admin API is enabled. Acceptance: - Chat, embeddings, and transcription requests create audit rows when enabled. - Prompt and completion content are not logged. - Failed routing and upstream errors are logged. - Casual deployments have no audit behavior unless enabled. ### M4: Model and Operation Authorization Goal: let Foundation keys be limited to approved roles, models, and operations. Tasks: - Add allowed models and allowed operations to named keys. - Enforce operation scopes only when authorization enforcement is enabled. - Support exact model IDs and conservative glob patterns such as `local/*`, `openai/*`, `anthropic/*`, and `role/*`. - Prefer role IDs for migration workflows. Acceptance: - A chat-only key cannot call embeddings when enforcement is enabled. - A key restricted to `archive_migrator` cannot call unrelated roles. - Legacy static keys are unaffected unless explicitly mapped into stricter mode. ### M5: Archive Migration Profile Goal: support TalkOrigins/SciSiteForge-style migration without direct provider keys in migration scripts. Tasks: - Add `configs/roles.foundation.archive.yaml`. - Add roles such as `archive_migrator`, `archive_metadata_extractor`, `archive_link_reviewer`, `archive_copyeditor`, and `archive_factcheck_assistant`. - Add `configs/control.foundation.example.yaml`. - Add `configs/clients/archive_migration.example.env`. - Add a smoke script that calls `archive_migrator` through the OpenAI-compatible facade. Acceptance: - A migration client only needs `GENIEHIVE_BASE_URL`, `GENIEHIVE_API_KEY`, and `GENIEHIVE_MODEL`. - The requested model is a role, not a provider-specific model. - Local-only provider routing remains possible. ### M6: Provider Credential Indirection Goal: keep paid provider credentials out of role configs, node configs, and client scripts. Tasks: - Add provider config entries using environment variables first. - Add external/provider-backed service registration without requiring node heartbeat. - Resolve provider headers centrally in the upstream layer. - Keep provider credential storage optional; encrypted-at-rest credentials can be deferred. Acceptance: - Provider keys are loaded from environment variables, not committed YAML. - Provider-backed services can be routed like local services. - Local-only deployments do not need provider sections. ### M7: Anthropic Messages Adapter Goal: expose Anthropic models through the existing OpenAI-compatible chat facade. Tasks: - Add provider protocol dispatch in `UpstreamClient`. - Transform OpenAI-shaped messages into Anthropic Messages requests. - Transform Anthropic responses back to OpenAI-compatible chat completions. - Reject Anthropic streaming clearly until implemented. Acceptance: - A chat request can route to an Anthropic-backed service. - System messages and usage fields are mapped correctly. - Unsupported streaming fails with a specific error. ### M8: Budget and Quota Enforcement Goal: prevent accidental provider overspend. Tasks: - Add budget config with disabled default. - Use audit summaries to calculate monthly usage. - Add request, token, and estimated-cost limits per key, provider, and globally. - Add configurable price maps. Acceptance: - Requests over configured limits are denied before upstream calls. - Unknown-cost behavior is configurable. - Casual deployments do not perform budget checks. ### M9: Admin CLI and Operations Docs Goal: make managed operation scriptable and understandable. Tasks: - Add `geniehive-admin` CLI for create/list/disable/enable keys and usage summaries. - Add Foundation docs for gateway operation, provider accounts, key management, archive migration workflow, and emergency disable. - Document when provider-native seats are needed instead of GenieHive routing. Acceptance: - A new operator can provision and revoke a user key without editing SQLite. - A board-facing control summary explains ownership, auditability, and budget control. ### M10: Security Review Goal: make the Foundation profile safe to expose beyond localhost. Tasks: - Add a security checklist covering provider keys, admin auth, content logging, CORS, TLS/reverse proxy, backup/restore, rate limits, and emergency disable. - Implement critical checklist items or explicitly defer with issue references. - Keep WAN and zero-trust networking as deployment concerns unless a concrete need appears. Acceptance: - Security checklist exists. - Critical production risks have implementation or documented mitigations. ## Initial Implementation Order 1. M0: Baseline and compatibility guard. 2. M1: Config profiles and feature flags. 3. M2: Named client credentials. 4. M3: Request audit log. 5. M4: Model and operation authorization. 6. M5: Archive migration profile. 7. M6: Provider credential indirection. 8. M7: Anthropic Messages adapter. 9. M8: Budget and quota enforcement. 10. M9: Admin CLI and operations docs. 11. M10: Security review. This order lets local-only and TalkOrigins migration pilots start before paid provider routing and budget controls are complete.