11 KiB
Foundation Gateway Roadmap
Last updated: 2026-04-29
Decision
Do not fork GenieHive for the Foundation AI gateway work. Implement the feature set as an optional hardening profile on top of the existing local-first control plane.
The core project should continue to support casual deployment:
- local model services remain first-class
- static
client_api_keysandnode_api_keysremain supported - empty key lists can still disable auth for development
- audit logging, named keys, quotas, provider accounts, and admin endpoints are opt-in
Foundation deployments should enable stricter controls through config, role catalogs, and operator documentation.
Design Principle
Separate mechanism from policy.
Core GenieHive mechanisms:
- authenticate a client and attach a request identity
- route OpenAI-compatible requests through roles and services
- optionally record audit metadata without prompt or completion content
- optionally enforce model and operation scopes
- optionally route to external provider-backed services
- optionally summarize usage and enforce budgets
Foundation policy:
- who may receive a key
- what models and roles are approved
- what budgets apply
- what provider accounts are used
- how requests are reviewed before public publication
- how emergency disable and key rotation are performed
Compatibility Contract
Every Foundation hardening change must preserve these behaviors unless a config explicitly opts into stricter operation:
- Existing
configs/control.example.yamlcontinues to load. - Existing static
auth.client_api_keyscontinues to authorize requests. - Existing node registration keys continue to work.
- Existing role catalogs continue to route without client allowlists.
GET /v1/models, chat, embeddings, transcription, and cluster inspection remain available in casual deployments.- No provider credentials are required for local-only deployment.
- Admin endpoints are disabled unless admin authentication is configured.
Profiles
Casual Profile
The casual profile is the default shape of GenieHive.
Expected traits:
- local or LAN-bound control plane
- static shared client key, or no auth during isolated development
- no audit log by default
- no budget enforcement
- no provider credential store
- no admin API exposed by default
Foundation Gateway Profile
The Foundation gateway profile is an opt-in deployment mode for managed access to local and paid AI services.
Expected traits:
- named, revocable client credentials
- request audit log without prompt or completion content
- model and operation allowlists per key
- Foundation-owned provider account indirection
- optional budget and quota enforcement
- migration-specific role catalogs
- operator and board-readable governance documentation
Configuration Shape
The final config shape may evolve, but the intended compatibility model is:
deployment_profile: "casual"
auth:
client_api_keys:
- "change-me-client-key"
node_api_keys:
- "change-me-node-key"
enable_named_client_keys: false
key_hash_secret_env: "GENIEHIVE_KEY_HASH_SECRET"
audit:
enabled: false
admin_api:
enabled: false
authorization:
enforce_model_allowlists: false
enforce_operation_allowlists: false
empty_allowlist_means_no_access: true
providers: []
budgeting:
enabled: false
Foundation example configs can switch these flags on. Casual example configs should stay short and understandable.
Revised Milestones
M0: Baseline and Compatibility Guard
Goal: record the current behavior and make compatibility explicit before adding governance features.
Tasks:
- Add
docs/foundation_gateway_baseline.md. - Record current commit, test command, existing exposed ports, and supported casual deployment behavior.
- Add or preserve tests proving
configs/control.example.yamlstill loads and staticX-Api-Keyauth still works.
Acceptance:
- Baseline document exists.
- Current test suite passes or failures are documented.
- Compatibility contract is visible in docs.
M1: Config Profiles and Feature Flags
Goal: introduce opt-in switches without changing runtime behavior.
Tasks:
- Add config models for
deployment_profile,audit,admin_api,authorization,providers, andbudgeting. - Keep default values equivalent to current casual behavior.
- Add a Foundation example config skeleton.
- Add tests for default values and legacy config loading.
Acceptance:
- Existing configs load unchanged.
- New config sections are accepted.
- No governance feature activates by default.
M2: Named Client Credentials
Goal: support named, revocable API keys while keeping static keys working.
Tasks:
- Add
ClientContextwith principal metadata. - Add API key generation, hashing, verification, and redaction helpers.
- Add a
client_keysSQLite table. - Add registry methods to create, list, disable, enable, and touch keys.
- Support named keys only when
auth.enable_named_client_keysis true. - Preserve static
auth.client_api_keys.
Acceptance:
- Static keys still work.
- Named keys work through
X-Api-Keywhen enabled. - Disabled named keys fail.
- Raw keys are never stored.
- Request handlers can read authenticated client context.
M3: Request Audit Log
Goal: make production requests attributable without storing prompt or completion content.
Status: implemented for chat, embeddings, and transcription request wrappers.
Audit logging is disabled by default and enabled by audit.enabled. Admin audit
read endpoints are only mounted when admin_api.enabled is true.
Tasks:
- Add request ID generation from
X-Request-Idor UUID. - Add
request_audit_logSQLite table. - Record identity, operation, requested model, resolved service, upstream model, provider kind, status, duration, token usage when available, estimated cost when available, and error category.
- Add admin-only query and summary endpoints, disabled unless admin API is enabled.
Acceptance:
- Chat, embeddings, and transcription requests create audit rows when enabled.
- Prompt and completion content are not logged.
- Failed routing and upstream errors are logged.
- Casual deployments have no audit behavior unless enabled.
M4: Model and Operation Authorization
Goal: let Foundation keys be limited to approved roles, models, and operations.
Status: implemented for named client keys. Enforcement is controlled by
authorization.enforce_model_allowlists and
authorization.enforce_operation_allowlists. Static and development auth retain
casual-deployment behavior.
Tasks:
- Add allowed models and allowed operations to named keys.
- Enforce operation scopes only when authorization enforcement is enabled.
- Support exact model IDs and conservative glob patterns such as
local/*,openai/*,anthropic/*, androle/*. - Prefer role IDs for migration workflows.
Acceptance:
- A chat-only key cannot call embeddings when enforcement is enabled.
- A key restricted to
archive_migratorcannot call unrelated roles. - Legacy static keys are unaffected unless explicitly mapped into stricter mode.
M5: Archive Migration Profile
Goal: support TalkOrigins/SciSiteForge-style migration without direct provider keys in migration scripts.
Tasks:
- Add
configs/roles.foundation.archive.yaml. - Add roles such as
archive_migrator,archive_metadata_extractor,archive_link_reviewer,archive_copyeditor, andarchive_factcheck_assistant. - Add
configs/control.foundation.example.yaml. - Add
configs/clients/archive_migration.example.env. - Add a smoke script that calls
archive_migratorthrough the OpenAI-compatible facade.
Acceptance:
- A migration client only needs
GENIEHIVE_BASE_URL,GENIEHIVE_API_KEY, andGENIEHIVE_MODEL. - The requested model is a role, not a provider-specific model.
- Local-only provider routing remains possible.
M6: Provider Credential Indirection
Goal: keep paid provider credentials out of role configs, node configs, and client scripts.
Tasks:
- Add provider config entries using environment variables first.
- Add external/provider-backed service registration without requiring node heartbeat.
- Resolve provider headers centrally in the upstream layer.
- Keep provider credential storage optional; encrypted-at-rest credentials can be deferred.
Acceptance:
- Provider keys are loaded from environment variables, not committed YAML.
- Provider-backed services can be routed like local services.
- Local-only deployments do not need provider sections.
M7: Anthropic Messages Adapter
Goal: expose Anthropic models through the existing OpenAI-compatible chat facade.
Tasks:
- Add provider protocol dispatch in
UpstreamClient. - Transform OpenAI-shaped messages into Anthropic Messages requests.
- Transform Anthropic responses back to OpenAI-compatible chat completions.
- Reject Anthropic streaming clearly until implemented.
Acceptance:
- A chat request can route to an Anthropic-backed service.
- System messages and usage fields are mapped correctly.
- Unsupported streaming fails with a specific error.
M8: Budget and Quota Enforcement
Goal: prevent accidental provider overspend.
Tasks:
- Add budget config with disabled default.
- Use audit summaries to calculate monthly usage.
- Add request, token, and estimated-cost limits per key, provider, and globally.
- Add configurable price maps.
Acceptance:
- Requests over configured limits are denied before upstream calls.
- Unknown-cost behavior is configurable.
- Casual deployments do not perform budget checks.
M9: Admin CLI and Operations Docs
Goal: make managed operation scriptable and understandable.
Tasks:
- Add
geniehive-adminCLI for create/list/disable/enable keys and usage summaries. - Add Foundation docs for gateway operation, provider accounts, key management, archive migration workflow, and emergency disable.
- Document when provider-native seats are needed instead of GenieHive routing.
Acceptance:
- A new operator can provision and revoke a user key without editing SQLite.
- A board-facing control summary explains ownership, auditability, and budget control.
M10: Security Review
Goal: make the Foundation profile safe to expose beyond localhost.
Tasks:
- Add a security checklist covering provider keys, admin auth, content logging, CORS, TLS/reverse proxy, backup/restore, rate limits, and emergency disable.
- Implement critical checklist items or explicitly defer with issue references.
- Keep WAN and zero-trust networking as deployment concerns unless a concrete need appears.
Acceptance:
- Security checklist exists.
- Critical production risks have implementation or documented mitigations.
Initial Implementation Order
- M0: Baseline and compatibility guard.
- M1: Config profiles and feature flags.
- M2: Named client credentials.
- M3: Request audit log.
- M4: Model and operation authorization.
- M5: Archive migration profile.
- M6: Provider credential indirection.
- M7: Anthropic Messages adapter.
- M8: Budget and quota enforcement.
- M9: Admin CLI and operations docs.
- M10: Security review.
This order lets local-only and TalkOrigins migration pilots start before paid provider routing and budget controls are complete.