Compare commits
No commits in common. "960aa11d93a7b6507ed472752c0beeb64fba6690" and "2355cf8114db5a1ac4630ca22aba63c703553f70" have entirely different histories.
960aa11d93
...
2355cf8114
|
|
@@ -1,59 +0,0 @@
|
||||||
deployment_profile: "foundation_gateway"
|
|
||||||
|
|
||||||
server:
|
|
||||||
host: "127.0.0.1"
|
|
||||||
port: 8800
|
|
||||||
|
|
||||||
auth:
|
|
||||||
# Keep a break-glass/static admin key only for initial provisioning or recovery.
|
|
||||||
client_api_keys:
|
|
||||||
- "change-me-foundation-admin-key"
|
|
||||||
node_api_keys:
|
|
||||||
- "change-me-node-key"
|
|
||||||
enable_named_client_keys: true
|
|
||||||
key_hash_secret_env: "GENIEHIVE_KEY_HASH_SECRET"
|
|
||||||
|
|
||||||
audit:
|
|
||||||
enabled: true
|
|
||||||
|
|
||||||
admin_api:
|
|
||||||
enabled: true
|
|
||||||
|
|
||||||
authorization:
|
|
||||||
enforce_model_allowlists: true
|
|
||||||
enforce_operation_allowlists: true
|
|
||||||
empty_allowlist_means_no_access: true
|
|
||||||
|
|
||||||
storage:
|
|
||||||
sqlite_path: "state/geniehive.foundation.sqlite3"
|
|
||||||
|
|
||||||
roles_path: "configs/roles.foundation.archive.yaml"
|
|
||||||
|
|
||||||
routing:
|
|
||||||
health_stale_after_s: 30
|
|
||||||
default_strategy: "scored"
|
|
||||||
|
|
||||||
providers:
|
|
||||||
# Provider-backed services are optional. Keep API keys in environment variables,
|
|
||||||
# not in YAML or client scripts.
|
|
||||||
- provider_id: "openai-foundation"
|
|
||||||
provider_kind: "openai_compatible"
|
|
||||||
base_url: "https://api.openai.com"
|
|
||||||
api_key_env: "OPENAI_API_KEY"
|
|
||||||
enabled: false
|
|
||||||
- provider_id: "anthropic-foundation"
|
|
||||||
provider_kind: "anthropic_messages"
|
|
||||||
base_url: "https://api.anthropic.com"
|
|
||||||
api_key_env: "ANTHROPIC_API_KEY"
|
|
||||||
default_headers:
|
|
||||||
anthropic-version: "2023-06-01"
|
|
||||||
enabled: false
|
|
||||||
|
|
||||||
budgeting:
|
|
||||||
enabled: false
|
|
||||||
reset_day_of_month: 1
|
|
||||||
global_monthly_budget_cents: 5000
|
|
||||||
provider_monthly_budget_cents:
|
|
||||||
openai-foundation: 3000
|
|
||||||
anthropic-foundation: 3000
|
|
||||||
deny_on_unknown_cost: false
|
|
||||||
|
|
@@ -1,73 +0,0 @@
|
||||||
# Foundation Gateway Baseline
|
|
||||||
|
|
||||||
Last updated: 2026-04-29
|
|
||||||
|
|
||||||
## Repository State
|
|
||||||
|
|
||||||
- Repository: `/home/netuser/bin/geniehive`
|
|
||||||
- Baseline commit: `2355cf8114db5a1ac4630ca22aba63c703553f70`
|
|
||||||
- Branch: `main`
|
|
||||||
|
|
||||||
## Current Capability Snapshot
|
|
||||||
|
|
||||||
GenieHive is currently a local-first control plane for heterogeneous generative
AI services. It already supports:
|
|
||||||
|
|
||||||
- OpenAI-compatible `GET /v1/models`
|
|
||||||
- OpenAI-compatible `POST /v1/chat/completions`
|
|
||||||
- OpenAI-compatible `POST /v1/embeddings`
|
|
||||||
- `POST /v1/audio/transcriptions` multipart proxying
|
|
||||||
- node registration and heartbeat
|
|
||||||
- SQLite-backed hosts, services, roles, and benchmark samples
|
|
||||||
- role-based route resolution
|
|
||||||
- request policy shaping
|
|
||||||
- benchmark-informed route scoring
|
|
||||||
- optional active service health probing
|
|
||||||
- static client and node API keys
|
|
||||||
|
|
||||||
## Casual Deployment Behavior To Preserve
|
|
||||||
|
|
||||||
- `configs/control.example.yaml` loads without Foundation-specific sections.
|
|
||||||
- Static `auth.client_api_keys` authorize client requests with `X-Api-Key`.
|
|
||||||
- Static `auth.node_api_keys` authorize node requests with
|
|
||||||
`X-GenieHive-Node-Key`.
|
|
||||||
- Empty client or node key lists disable that auth check for development.
|
|
||||||
- Local model servers do not require provider credential config.
|
|
||||||
- Admin endpoints, audit logging, named keys, and budget checks are not required
|
|
||||||
for a local-only deployment.
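
The static-key behavior listed above can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not the project's `_check_key`: the function name `static_key_allows` and the plain-dict header lookup (case-sensitive, unlike a real HTTP header map) are hypothetical.

```python
from __future__ import annotations

import hmac


def static_key_allows(headers: dict[str, str], allowed_keys: list[str], header_name: str) -> bool:
    """Return True when a request may proceed under static-key auth (hypothetical helper)."""
    # An empty key list disables the check, matching the development behavior above.
    if not allowed_keys:
        return True
    provided = headers.get(header_name)
    if provided is None:
        return False
    # Compare the provided value against each configured key without leaking timing.
    return any(
        hmac.compare_digest(provided.encode("utf-8"), key.encode("utf-8"))
        for key in allowed_keys
    )
```

The same function would cover both `X-Api-Key` with `auth.client_api_keys` and `X-GenieHive-Node-Key` with `auth.node_api_keys`, since only the header name and key list differ.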
|
|
||||||
|
|
||||||
## Current Example Ports
|
|
||||||
|
|
||||||
- Control plane default: `127.0.0.1:8800`
|
|
||||||
- Node examples commonly use localhost service endpoints for Ollama,
|
|
||||||
llama.cpp, llamafile, or vLLM.
|
|
||||||
- Recent ZeroTier test deployment used control plane binding
|
|
||||||
`172.24.50.65:8800`, node `127.0.0.1:8891`, and llama.cpp
|
|
||||||
`127.0.0.1:18091`.
|
|
||||||
|
|
||||||
## Baseline Verification
|
|
||||||
|
|
||||||
Run from the repository root:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
python -m pytest -q tests
|
|
||||||
```
|
|
||||||
|
|
||||||
Expected current result at baseline: all tests pass.
|
|
||||||
|
|
||||||
Current verification result after adding the Foundation roadmap, config profile
scaffold, named client key storage, opt-in named auth, admin key endpoints,
request audit logging, and named-key model/operation authorization:
|
|
||||||
|
|
||||||
```text
|
|
||||||
66 passed
|
|
||||||
```
|
|
||||||
|
|
||||||
## Known Constraints
|
|
||||||
|
|
||||||
- Client authentication is static-key based, not named or revocable per user.
|
|
||||||
- Request attribution is not currently persisted.
|
|
||||||
- Provider credentials are not modeled as first-class control-plane objects.
|
|
||||||
- No budget or quota enforcement exists.
|
|
||||||
- Anthropic Messages API is not natively adapted behind the OpenAI-compatible
|
|
||||||
facade.
|
|
||||||
|
|
@@ -1,359 +0,0 @@
|
||||||
# Foundation Gateway Roadmap
|
|
||||||
|
|
||||||
Last updated: 2026-04-29
|
|
||||||
|
|
||||||
## Decision
|
|
||||||
|
|
||||||
Do not fork GenieHive for the Foundation AI gateway work. Implement the feature
set as an optional hardening profile on top of the existing local-first control
plane.
|
|
||||||
|
|
||||||
The core project should continue to support casual deployment:
|
|
||||||
|
|
||||||
- local model services remain first-class
|
|
||||||
- static `client_api_keys` and `node_api_keys` remain supported
|
|
||||||
- empty key lists can still disable auth for development
|
|
||||||
- audit logging, named keys, quotas, provider accounts, and admin endpoints are
|
|
||||||
opt-in
|
|
||||||
|
|
||||||
Foundation deployments should enable stricter controls through config, role
catalogs, and operator documentation.
|
|
||||||
|
|
||||||
## Design Principle
|
|
||||||
|
|
||||||
Separate mechanism from policy.
|
|
||||||
|
|
||||||
Core GenieHive mechanisms:
|
|
||||||
|
|
||||||
- authenticate a client and attach a request identity
|
|
||||||
- route OpenAI-compatible requests through roles and services
|
|
||||||
- optionally record audit metadata without prompt or completion content
|
|
||||||
- optionally enforce model and operation scopes
|
|
||||||
- optionally route to external provider-backed services
|
|
||||||
- optionally summarize usage and enforce budgets
|
|
||||||
|
|
||||||
Foundation policy:
|
|
||||||
|
|
||||||
- who may receive a key
|
|
||||||
- what models and roles are approved
|
|
||||||
- what budgets apply
|
|
||||||
- what provider accounts are used
|
|
||||||
- how requests are reviewed before public publication
|
|
||||||
- how emergency disable and key rotation are performed
|
|
||||||
|
|
||||||
## Compatibility Contract
|
|
||||||
|
|
||||||
Every Foundation hardening change must preserve these behaviors unless a config
|
|
||||||
explicitly opts into stricter operation:
|
|
||||||
|
|
||||||
1. Existing `configs/control.example.yaml` continues to load.
|
|
||||||
2. Existing static `auth.client_api_keys` continues to authorize requests.
|
|
||||||
3. Existing node registration keys continue to work.
|
|
||||||
4. Existing role catalogs continue to route without client allowlists.
|
|
||||||
5. `GET /v1/models`, chat, embeddings, transcription, and cluster inspection
|
|
||||||
remain available in casual deployments.
|
|
||||||
6. No provider credentials are required for local-only deployment.
|
|
||||||
7. Admin endpoints are disabled unless admin authentication is configured.
|
|
||||||
|
|
||||||
## Profiles
|
|
||||||
|
|
||||||
### Casual Profile
|
|
||||||
|
|
||||||
The casual profile is the default shape of GenieHive.
|
|
||||||
|
|
||||||
Expected traits:
|
|
||||||
|
|
||||||
- local or LAN-bound control plane
|
|
||||||
- static shared client key, or no auth during isolated development
|
|
||||||
- no audit log by default
|
|
||||||
- no budget enforcement
|
|
||||||
- no provider credential store
|
|
||||||
- no admin API exposed by default
|
|
||||||
|
|
||||||
### Foundation Gateway Profile
|
|
||||||
|
|
||||||
The Foundation gateway profile is an opt-in deployment mode for managed access
to local and paid AI services.
|
|
||||||
|
|
||||||
Expected traits:
|
|
||||||
|
|
||||||
- named, revocable client credentials
|
|
||||||
- request audit log without prompt or completion content
|
|
||||||
- model and operation allowlists per key
|
|
||||||
- Foundation-owned provider account indirection
|
|
||||||
- optional budget and quota enforcement
|
|
||||||
- migration-specific role catalogs
|
|
||||||
- operator and board-readable governance documentation
|
|
||||||
|
|
||||||
## Configuration Shape
|
|
||||||
|
|
||||||
The final config shape may evolve, but the intended compatibility model is:
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
deployment_profile: "casual"
|
|
||||||
|
|
||||||
auth:
|
|
||||||
client_api_keys:
|
|
||||||
- "change-me-client-key"
|
|
||||||
node_api_keys:
|
|
||||||
- "change-me-node-key"
|
|
||||||
enable_named_client_keys: false
|
|
||||||
key_hash_secret_env: "GENIEHIVE_KEY_HASH_SECRET"
|
|
||||||
|
|
||||||
audit:
|
|
||||||
enabled: false
|
|
||||||
|
|
||||||
admin_api:
|
|
||||||
enabled: false
|
|
||||||
|
|
||||||
authorization:
|
|
||||||
enforce_model_allowlists: false
|
|
||||||
enforce_operation_allowlists: false
|
|
||||||
empty_allowlist_means_no_access: true
|
|
||||||
|
|
||||||
providers: []
|
|
||||||
|
|
||||||
budgeting:
|
|
||||||
enabled: false
|
|
||||||
```
|
|
||||||
|
|
||||||
Foundation example configs can switch these flags on. Casual example configs
|
|
||||||
should stay short and understandable.
|
|
||||||
|
|
||||||
## Revised Milestones
|
|
||||||
|
|
||||||
### M0: Baseline and Compatibility Guard
|
|
||||||
|
|
||||||
Goal: record the current behavior and make compatibility explicit before adding
|
|
||||||
governance features.
|
|
||||||
|
|
||||||
Tasks:
|
|
||||||
|
|
||||||
- Add `docs/foundation_gateway_baseline.md`.
|
|
||||||
- Record current commit, test command, existing exposed ports, and supported
|
|
||||||
casual deployment behavior.
|
|
||||||
- Add or preserve tests proving `configs/control.example.yaml` still loads and
|
|
||||||
static `X-Api-Key` auth still works.
|
|
||||||
|
|
||||||
Acceptance:
|
|
||||||
|
|
||||||
- Baseline document exists.
|
|
||||||
- Current test suite passes or failures are documented.
|
|
||||||
- Compatibility contract is visible in docs.
|
|
||||||
|
|
||||||
### M1: Config Profiles and Feature Flags
|
|
||||||
|
|
||||||
Goal: introduce opt-in switches without changing runtime behavior.
|
|
||||||
|
|
||||||
Tasks:
|
|
||||||
|
|
||||||
- Add config models for `deployment_profile`, `audit`, `admin_api`,
|
|
||||||
`authorization`, `providers`, and `budgeting`.
|
|
||||||
- Keep default values equivalent to current casual behavior.
|
|
||||||
- Add a Foundation example config skeleton.
|
|
||||||
- Add tests for default values and legacy config loading.
|
|
||||||
|
|
||||||
Acceptance:
|
|
||||||
|
|
||||||
- Existing configs load unchanged.
|
|
||||||
- New config sections are accepted.
|
|
||||||
- No governance feature activates by default.
|
|
||||||
|
|
||||||
### M2: Named Client Credentials
|
|
||||||
|
|
||||||
Goal: support named, revocable API keys while keeping static keys working.
|
|
||||||
|
|
||||||
Tasks:
|
|
||||||
|
|
||||||
- Add `ClientContext` with principal metadata.
|
|
||||||
- Add API key generation, hashing, verification, and redaction helpers.
|
|
||||||
- Add a `client_keys` SQLite table.
|
|
||||||
- Add registry methods to create, list, disable, enable, and touch keys.
|
|
||||||
- Support named keys only when `auth.enable_named_client_keys` is true.
|
|
||||||
- Preserve static `auth.client_api_keys`.
|
|
||||||
|
|
||||||
Acceptance:
|
|
||||||
|
|
||||||
- Static keys still work.
|
|
||||||
- Named keys work through `X-Api-Key` when enabled.
|
|
||||||
- Disabled named keys fail.
|
|
||||||
- Raw keys are never stored.
|
|
||||||
- Request handlers can read authenticated client context.
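
The M2 acceptance criteria can be illustrated with an in-memory stand-in for the `client_keys` table. This is a sketch under assumptions: `InMemoryKeyStore` and its method names are hypothetical and do not describe the real registry API; only the keyed-hash idea (store an HMAC-SHA256 of the key, never the raw value) comes from the roadmap.

```python
from __future__ import annotations

import hashlib
import hmac
import secrets


def hash_key(raw_key: str, secret: str) -> str:
    """Keyed hash of an API key; only this value is persisted."""
    digest = hmac.new(secret.encode("utf-8"), raw_key.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"hmac-sha256:{digest}"


class InMemoryKeyStore:
    """Hypothetical stand-in for the `client_keys` SQLite table."""

    def __init__(self, secret: str) -> None:
        self._secret = secret
        self._rows: dict[str, dict] = {}  # key_hash -> row metadata

    def create(self, display_name: str) -> str:
        raw_key = f"gh_{secrets.token_urlsafe(32)}"
        self._rows[hash_key(raw_key, self._secret)] = {
            "display_name": display_name,
            "enabled": True,
        }
        return raw_key  # returned to the operator once; never stored

    def set_enabled(self, raw_key: str, enabled: bool) -> None:
        self._rows[hash_key(raw_key, self._secret)]["enabled"] = enabled

    def authenticate(self, raw_key: str) -> dict | None:
        # Unknown and disabled keys are indistinguishable to the caller.
        row = self._rows.get(hash_key(raw_key, self._secret))
        return row if row is not None and row["enabled"] else None
```

Because lookup is by hash, rotating the hash secret invalidates every stored key, which is worth keeping in mind when planning key rotation.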
|
|
||||||
|
|
||||||
### M3: Request Audit Log
|
|
||||||
|
|
||||||
Goal: make production requests attributable without storing prompt or completion
|
|
||||||
content.
|
|
||||||
|
|
||||||
Status: implemented for chat, embeddings, and transcription request wrappers.
Audit logging is disabled by default and enabled by `audit.enabled`. Admin audit
read endpoints are only mounted when `admin_api.enabled` is true.
|
|
||||||
|
|
||||||
Tasks:
|
|
||||||
|
|
||||||
- Add request ID generation from `X-Request-Id` or UUID.
|
|
||||||
- Add `request_audit_log` SQLite table.
|
|
||||||
- Record identity, operation, requested model, resolved service, upstream model,
|
|
||||||
provider kind, status, duration, token usage when available, estimated cost
|
|
||||||
when available, and error category.
|
|
||||||
- Add admin-only query and summary endpoints, disabled unless admin API is
|
|
||||||
enabled.
|
|
||||||
|
|
||||||
Acceptance:
|
|
||||||
|
|
||||||
- Chat, embeddings, and transcription requests create audit rows when enabled.
|
|
||||||
- Prompt and completion content are not logged.
|
|
||||||
- Failed routing and upstream errors are logged.
|
|
||||||
- Casual deployments have no audit behavior unless enabled.
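
A minimal sketch of the audit flow described in M3, using only the standard library. The column set is a reduced, assumed subset of the fields listed in the tasks, and the helper names (`request_id_from`, `open_audit_db`, `record_request`) are hypothetical. The point it illustrates is that the table carries request metadata only, with no prompt or completion columns.

```python
from __future__ import annotations

import sqlite3
import time
import uuid


def request_id_from(headers: dict[str, str]) -> str:
    """Reuse the caller's X-Request-Id when present, otherwise mint a UUID."""
    return headers.get("X-Request-Id") or str(uuid.uuid4())


def open_audit_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # Metadata only: no column exists that could hold prompt or completion text.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS request_audit_log (
            request_id TEXT PRIMARY KEY,
            key_id TEXT,
            operation TEXT NOT NULL,
            requested_model TEXT,
            status INTEGER NOT NULL,
            duration_ms REAL NOT NULL,
            error_category TEXT
        )
        """
    )
    return conn


def record_request(
    conn: sqlite3.Connection,
    *,
    request_id: str,
    key_id: str | None,
    operation: str,
    requested_model: str | None,
    status: int,
    started_at: float,
    error_category: str | None = None,
) -> None:
    """Write one audit row; `started_at` is a time.monotonic() reading."""
    duration_ms = (time.monotonic() - started_at) * 1000.0
    conn.execute(
        "INSERT INTO request_audit_log VALUES (?, ?, ?, ?, ?, ?, ?)",
        (request_id, key_id, operation, requested_model, status, duration_ms, error_category),
    )
    conn.commit()
```

Calling `record_request` from a `finally` block around the upstream call is one way to make sure failed routing and upstream errors are logged as well as successes.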
|
|
||||||
|
|
||||||
### M4: Model and Operation Authorization
|
|
||||||
|
|
||||||
Goal: let Foundation keys be limited to approved roles, models, and operations.
|
|
||||||
|
|
||||||
Status: implemented for named client keys. Enforcement is controlled by
`authorization.enforce_model_allowlists` and
`authorization.enforce_operation_allowlists`. Static and development auth retain
casual-deployment behavior.
|
|
||||||
|
|
||||||
Tasks:
|
|
||||||
|
|
||||||
- Add allowed models and allowed operations to named keys.
|
|
||||||
- Enforce operation scopes only when authorization enforcement is enabled.
|
|
||||||
- Support exact model IDs and conservative glob patterns such as `local/*`,
|
|
||||||
`openai/*`, `anthropic/*`, and `role/*`.
|
|
||||||
- Prefer role IDs for migration workflows.
|
|
||||||
|
|
||||||
Acceptance:
|
|
||||||
|
|
||||||
- A chat-only key cannot call embeddings when enforcement is enabled.
|
|
||||||
- A key restricted to `archive_migrator` cannot call unrelated roles.
|
|
||||||
- Legacy static keys are unaffected unless explicitly mapped into stricter mode.
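
The allowlist rules in M4 can be sketched with `fnmatch.fnmatchcase` from the standard library. The function names here are hypothetical, and the handling of the `role/` prefix (stripping it before matching) is an assumption about how role patterns map onto requested model names.

```python
from __future__ import annotations

from fnmatch import fnmatchcase


def pattern_matches(pattern: str, value: str) -> bool:
    # Assumption: a "role/" prefix is a label on the pattern, not part of the
    # requested model string, so it is stripped before matching.
    if pattern.startswith("role/"):
        pattern = pattern[len("role/"):]
    return fnmatchcase(value, pattern)


def is_allowed(value: str, allowed: tuple[str, ...], *, empty_means_no_access: bool = True) -> bool:
    """Check an operation or model name against a key's allowlist."""
    if not allowed:
        # An empty allowlist denies everything unless configured otherwise.
        return not empty_means_no_access
    return any(pattern_matches(pattern, value) for pattern in allowed)
```

Note that under this prefix-stripping assumption a bare `role/*` pattern reduces to `*` and matches every model name, so narrower patterns such as `role/archive_*` are the safer choice for scoped keys.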
|
|
||||||
|
|
||||||
### M5: Archive Migration Profile
|
|
||||||
|
|
||||||
Goal: support TalkOrigins/SciSiteForge-style migration without direct provider
|
|
||||||
keys in migration scripts.
|
|
||||||
|
|
||||||
Tasks:
|
|
||||||
|
|
||||||
- Add `configs/roles.foundation.archive.yaml`.
|
|
||||||
- Add roles such as `archive_migrator`, `archive_metadata_extractor`,
|
|
||||||
`archive_link_reviewer`, `archive_copyeditor`, and
|
|
||||||
`archive_factcheck_assistant`.
|
|
||||||
- Add `configs/control.foundation.example.yaml`.
|
|
||||||
- Add `configs/clients/archive_migration.example.env`.
|
|
||||||
- Add a smoke script that calls `archive_migrator` through the OpenAI-compatible
|
|
||||||
facade.
|
|
||||||
|
|
||||||
Acceptance:
|
|
||||||
|
|
||||||
- A migration client only needs `GENIEHIVE_BASE_URL`, `GENIEHIVE_API_KEY`, and
|
|
||||||
`GENIEHIVE_MODEL`.
|
|
||||||
- The requested model is a role, not a provider-specific model.
|
|
||||||
- Local-only provider routing remains possible.
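
A migration client that satisfies the M5 acceptance criteria might build its request as follows. This is a sketch, not the planned smoke script: `build_chat_request` is a hypothetical helper, and it only constructs the request so it can be inspected without a running gateway. The endpoint path and `X-Api-Key` header are the ones documented for the OpenAI-compatible facade.

```python
from __future__ import annotations

import json
import os
import urllib.request
from collections.abc import Mapping


def build_chat_request(prompt: str, env: Mapping[str, str] = os.environ) -> urllib.request.Request:
    """Build a chat completion request from the three GENIEHIVE_* variables."""
    base_url = env["GENIEHIVE_BASE_URL"].rstrip("/")
    body = {
        # The model is a role ID such as "archive_migrator", not a provider model.
        "model": env["GENIEHIVE_MODEL"],
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Api-Key": env["GENIEHIVE_API_KEY"],
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. No provider credential appears anywhere in the client, which is the property the milestone is after.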
|
|
||||||
|
|
||||||
### M6: Provider Credential Indirection
|
|
||||||
|
|
||||||
Goal: keep paid provider credentials out of role configs, node configs, and
|
|
||||||
client scripts.
|
|
||||||
|
|
||||||
Tasks:
|
|
||||||
|
|
||||||
- Add provider config entries using environment variables first.
|
|
||||||
- Add external/provider-backed service registration without requiring node
|
|
||||||
heartbeat.
|
|
||||||
- Resolve provider headers centrally in the upstream layer.
|
|
||||||
- Keep provider credential storage optional; encrypted-at-rest credentials can
|
|
||||||
be deferred.
|
|
||||||
|
|
||||||
Acceptance:
|
|
||||||
|
|
||||||
- Provider keys are loaded from environment variables, not committed YAML.
|
|
||||||
- Provider-backed services can be routed like local services.
|
|
||||||
- Local-only deployments do not need provider sections.
|
|
||||||
|
|
||||||
### M7: Anthropic Messages Adapter
|
|
||||||
|
|
||||||
Goal: expose Anthropic models through the existing OpenAI-compatible chat facade.
|
|
||||||
|
|
||||||
Tasks:
|
|
||||||
|
|
||||||
- Add provider protocol dispatch in `UpstreamClient`.
|
|
||||||
- Transform OpenAI-shaped messages into Anthropic Messages requests.
|
|
||||||
- Transform Anthropic responses back to OpenAI-compatible chat completions.
|
|
||||||
- Reject Anthropic streaming clearly until implemented.
|
|
||||||
|
|
||||||
Acceptance:
|
|
||||||
|
|
||||||
- A chat request can route to an Anthropic-backed service.
|
|
||||||
- System messages and usage fields are mapped correctly.
|
|
||||||
- Unsupported streaming fails with a specific error.
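
The request and response mapping in M7 could look roughly like the following. This is a sketch under assumptions, not the adapter itself: the function names, the `UnsupportedStreamingError` class, and the `default_max_tokens` value are hypothetical, and only plain string message content is handled.

```python
from __future__ import annotations


class UnsupportedStreamingError(ValueError):
    """Raised when a streaming request targets an Anthropic-backed service."""


# Anthropic stop reasons mapped onto OpenAI finish reasons.
_FINISH_REASONS = {"end_turn": "stop", "stop_sequence": "stop", "max_tokens": "length"}


def to_anthropic_request(openai_body: dict, *, default_max_tokens: int = 1024) -> dict:
    if openai_body.get("stream"):
        raise UnsupportedStreamingError("streaming is not yet supported for Anthropic-backed services")
    system_parts: list[str] = []
    messages: list[dict] = []
    for message in openai_body["messages"]:
        if message["role"] == "system":
            # The Messages API takes the system prompt as a top-level field.
            system_parts.append(message["content"])
        else:
            messages.append({"role": message["role"], "content": message["content"]})
    request = {
        "model": openai_body["model"],
        # max_tokens is required by the Messages API, so a default is supplied.
        "max_tokens": openai_body.get("max_tokens", default_max_tokens),
        "messages": messages,
    }
    if system_parts:
        request["system"] = "\n\n".join(system_parts)
    return request


def to_openai_response(anthropic_body: dict) -> dict:
    text = "".join(
        block["text"] for block in anthropic_body.get("content", []) if block.get("type") == "text"
    )
    usage = anthropic_body.get("usage", {})
    prompt_tokens = usage.get("input_tokens", 0)
    completion_tokens = usage.get("output_tokens", 0)
    return {
        "id": anthropic_body.get("id"),
        "object": "chat.completion",
        "model": anthropic_body.get("model"),
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": text},
                "finish_reason": _FINISH_REASONS.get(anthropic_body.get("stop_reason"), "stop"),
            }
        ],
        "usage": {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "total_tokens": prompt_tokens + completion_tokens,
        },
    }
```

Raising a dedicated exception type for streaming lets the HTTP layer turn it into a specific client error instead of a generic upstream failure.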
|
|
||||||
|
|
||||||
### M8: Budget and Quota Enforcement
|
|
||||||
|
|
||||||
Goal: prevent accidental provider overspend.
|
|
||||||
|
|
||||||
Tasks:
|
|
||||||
|
|
||||||
- Add budget config with disabled default.
|
|
||||||
- Use audit summaries to calculate monthly usage.
|
|
||||||
- Add request, token, and estimated-cost limits per key, provider, and globally.
|
|
||||||
- Add configurable price maps.
|
|
||||||
|
|
||||||
Acceptance:
|
|
||||||
|
|
||||||
- Requests over configured limits are denied before upstream calls.
|
|
||||||
- Unknown-cost behavior is configurable.
|
|
||||||
- Casual deployments do not perform budget checks.
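
The pre-flight check implied by M8 can be sketched as a pure function over already-summarized spend. The names `within_budget` and `request_allowed` are hypothetical, and treating an unknown cost as zero when `deny_on_unknown_cost` is false is one possible reading of "unknown-cost behavior is configurable", not a confirmed design.

```python
from __future__ import annotations


def within_budget(
    spent_cents: int,
    estimate_cents: int | None,
    limit_cents: int | None,
    *,
    deny_on_unknown_cost: bool = False,
) -> bool:
    """Decide whether one budget scope permits the request."""
    if limit_cents is None:
        return True  # no limit configured for this scope
    if estimate_cents is None:
        if deny_on_unknown_cost:
            return False
        estimate_cents = 0  # assumption: unknown cost counts as zero
    return spent_cents + estimate_cents <= limit_cents


def request_allowed(
    provider_id: str,
    estimate_cents: int | None,
    *,
    spent_global: int,
    spent_by_provider: dict[str, int],
    global_limit: int | None,
    provider_limits: dict[str, int],
    deny_on_unknown_cost: bool = False,
) -> bool:
    """Run before the upstream call; both scopes must permit the request."""
    return within_budget(
        spent_global, estimate_cents, global_limit, deny_on_unknown_cost=deny_on_unknown_cost
    ) and within_budget(
        spent_by_provider.get(provider_id, 0),
        estimate_cents,
        provider_limits.get(provider_id),
        deny_on_unknown_cost=deny_on_unknown_cost,
    )
```

With budgeting disabled, a casual deployment would simply never call this function, which keeps the default request path unchanged.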
|
|
||||||
|
|
||||||
### M9: Admin CLI and Operations Docs
|
|
||||||
|
|
||||||
Goal: make managed operation scriptable and understandable.
|
|
||||||
|
|
||||||
Tasks:
|
|
||||||
|
|
||||||
- Add `geniehive-admin` CLI for create/list/disable/enable keys and usage
|
|
||||||
summaries.
|
|
||||||
- Add Foundation docs for gateway operation, provider accounts, key management,
|
|
||||||
archive migration workflow, and emergency disable.
|
|
||||||
- Document when provider-native seats are needed instead of GenieHive routing.
|
|
||||||
|
|
||||||
Acceptance:
|
|
||||||
|
|
||||||
- A new operator can provision and revoke a user key without editing SQLite.
|
|
||||||
- A board-facing control summary explains ownership, auditability, and budget
|
|
||||||
control.
|
|
||||||
|
|
||||||
### M10: Security Review
|
|
||||||
|
|
||||||
Goal: make the Foundation profile safe to expose beyond localhost.
|
|
||||||
|
|
||||||
Tasks:
|
|
||||||
|
|
||||||
- Add a security checklist covering provider keys, admin auth, content logging,
|
|
||||||
CORS, TLS/reverse proxy, backup/restore, rate limits, and emergency disable.
|
|
||||||
- Implement critical checklist items or explicitly defer with issue references.
|
|
||||||
- Keep WAN and zero-trust networking as deployment concerns unless a concrete
|
|
||||||
need appears.
|
|
||||||
|
|
||||||
Acceptance:
|
|
||||||
|
|
||||||
- Security checklist exists.
|
|
||||||
- Critical production risks have implementation or documented mitigations.
|
|
||||||
|
|
||||||
## Initial Implementation Order
|
|
||||||
|
|
||||||
1. M0: Baseline and compatibility guard.
|
|
||||||
2. M1: Config profiles and feature flags.
|
|
||||||
3. M2: Named client credentials.
|
|
||||||
4. M3: Request audit log.
|
|
||||||
5. M4: Model and operation authorization.
|
|
||||||
6. M5: Archive migration profile.
|
|
||||||
7. M6: Provider credential indirection.
|
|
||||||
8. M7: Anthropic Messages adapter.
|
|
||||||
9. M8: Budget and quota enforcement.
|
|
||||||
10. M9: Admin CLI and operations docs.
|
|
||||||
11. M10: Security review.
|
|
||||||
|
|
||||||
This order lets local-only and TalkOrigins migration pilots start before paid
provider routing and budget controls are complete.
|
|
||||||
|
|
@@ -1,25 +1,7 @@
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import os
|
|
||||||
from dataclasses import dataclass
|
|
||||||
from fnmatch import fnmatchcase
|
|
||||||
|
|
||||||
from fastapi import HTTPException, Request, status
|
from fastapi import HTTPException, Request, status
|
||||||
|
|
||||||
from .keys import hash_api_key
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass(frozen=True)
|
|
||||||
class ClientContext:
|
|
||||||
auth_kind: str
|
|
||||||
key_id: str | None = None
|
|
||||||
display_name: str | None = None
|
|
||||||
principal_type: str | None = None
|
|
||||||
principal_ref: str | None = None
|
|
||||||
role: str | None = None
|
|
||||||
allowed_models: tuple[str, ...] = ()
|
|
||||||
allowed_operations: tuple[str, ...] = ()
|
|
||||||
|
|
||||||
|
|
||||||
def _check_key(request: Request, allowed_keys: list[str], header_name: str) -> None:
|
def _check_key(request: Request, allowed_keys: list[str], header_name: str) -> None:
|
||||||
if not allowed_keys:
|
if not allowed_keys:
|
||||||
|
|
@@ -33,137 +15,11 @@ def _check_key(request: Request, allowed_keys: list[str], header_name: str) -> N
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
def _set_client_context(request: Request, context: ClientContext) -> None:
|
def require_client_auth(request: Request) -> None:
|
||||||
request.state.client_context = context
|
|
||||||
|
|
||||||
|
|
||||||
def require_client_auth(request: Request) -> ClientContext:
|
|
||||||
cfg = request.app.state.cfg
|
cfg = request.app.state.cfg
|
||||||
provided = request.headers.get("X-Api-Key")
|
_check_key(request, cfg.auth.client_api_keys, "X-Api-Key")
|
||||||
|
|
||||||
if cfg.auth.client_api_keys and provided in cfg.auth.client_api_keys:
|
|
||||||
context = ClientContext(auth_kind="static")
|
|
||||||
_set_client_context(request, context)
|
|
||||||
return context
|
|
||||||
|
|
||||||
if cfg.auth.enable_named_client_keys:
|
|
||||||
if not provided:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_401_UNAUTHORIZED,
|
|
||||||
detail="unauthorized",
|
|
||||||
)
|
|
||||||
secret = os.environ.get(cfg.auth.key_hash_secret_env)
|
|
||||||
if not secret:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
|
||||||
detail=f"{cfg.auth.key_hash_secret_env} is required for named client keys",
|
|
||||||
)
|
|
||||||
key_hash = hash_api_key(provided, secret=secret)
|
|
||||||
key_row = request.app.state.registry.get_client_key_by_hash(key_hash)
|
|
||||||
if key_row is None or not key_row["enabled"]:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_401_UNAUTHORIZED,
|
|
||||||
detail="unauthorized",
|
|
||||||
)
|
|
||||||
request.app.state.registry.touch_client_key(key_row["key_id"])
|
|
||||||
context = ClientContext(
|
|
||||||
auth_kind="named",
|
|
||||||
key_id=key_row["key_id"],
|
|
||||||
display_name=key_row["display_name"],
|
|
||||||
principal_type=key_row["principal_type"],
|
|
||||||
principal_ref=key_row["principal_ref"],
|
|
||||||
role=key_row["role"],
|
|
||||||
allowed_models=tuple(key_row["allowed_models"]),
|
|
||||||
allowed_operations=tuple(key_row["allowed_operations"]),
|
|
||||||
)
|
|
||||||
_set_client_context(request, context)
|
|
||||||
return context
|
|
||||||
|
|
||||||
if cfg.auth.client_api_keys:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_401_UNAUTHORIZED,
|
|
||||||
detail="unauthorized",
|
|
||||||
)
|
|
||||||
|
|
||||||
context = ClientContext(auth_kind="development")
|
|
||||||
_set_client_context(request, context)
|
|
||||||
return context
|
|
||||||
|
|
||||||
|
|
||||||
def require_node_auth(request: Request) -> None:
|
def require_node_auth(request: Request) -> None:
|
||||||
cfg = request.app.state.cfg
|
cfg = request.app.state.cfg
|
||||||
_check_key(request, cfg.auth.node_api_keys, "X-GenieHive-Node-Key")
|
_check_key(request, cfg.auth.node_api_keys, "X-GenieHive-Node-Key")
|
||||||
|
|
||||||
|
|
||||||
def require_admin_auth(request: Request) -> ClientContext:
|
|
||||||
cfg = request.app.state.cfg
|
|
||||||
if not cfg.admin_api.enabled:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_404_NOT_FOUND,
|
|
||||||
detail="not found",
|
|
||||||
)
|
|
||||||
context = require_client_auth(request)
|
|
||||||
if context.auth_kind == "static" or context.role == "admin":
|
|
||||||
return context
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_403_FORBIDDEN,
|
|
||||||
detail="admin access required",
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def authorize_client_request(request: Request, *, operation: str, model: str | None) -> None:
|
|
||||||
cfg = request.app.state.cfg
|
|
||||||
context = getattr(request.state, "client_context", None)
|
|
||||||
if context is None:
|
|
||||||
return
|
|
||||||
# Static and development auth preserve casual-deployment behavior. Foundation
|
|
||||||
# scoped access is enforced for named keys only.
|
|
||||||
if context.auth_kind != "named":
|
|
||||||
return
|
|
||||||
if cfg.authorization.enforce_operation_allowlists:
|
|
||||||
_authorize_value(
|
|
||||||
value=operation,
|
|
||||||
allowed=context.allowed_operations,
|
|
||||||
empty_means_no_access=cfg.authorization.empty_allowlist_means_no_access,
|
|
||||||
denied_detail=f"operation '{operation}' is not allowed for this key",
|
|
||||||
)
|
|
||||||
if cfg.authorization.enforce_model_allowlists:
|
|
||||||
if not model:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_403_FORBIDDEN,
|
|
||||||
detail="model is required for model authorization",
|
|
||||||
)
|
|
||||||
_authorize_value(
|
|
||||||
value=model,
|
|
||||||
allowed=context.allowed_models,
|
|
||||||
empty_means_no_access=cfg.authorization.empty_allowlist_means_no_access,
|
|
||||||
denied_detail=f"model '{model}' is not allowed for this key",
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def _authorize_value(
|
|
||||||
*,
|
|
||||||
value: str,
|
|
||||||
allowed: tuple[str, ...],
|
|
||||||
empty_means_no_access: bool,
|
|
||||||
denied_detail: str,
|
|
||||||
) -> None:
|
|
||||||
if not allowed:
|
|
||||||
if empty_means_no_access:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_403_FORBIDDEN,
|
|
||||||
detail=denied_detail,
|
|
||||||
)
|
|
||||||
return
|
|
||||||
if any(_allow_pattern_matches(pattern, value) for pattern in allowed):
|
|
||||||
return
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_403_FORBIDDEN,
|
|
||||||
detail=denied_detail,
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def _allow_pattern_matches(pattern: str, value: str) -> bool:
|
|
||||||
if pattern.startswith("role/"):
|
|
||||||
pattern = pattern.removeprefix("role/")
|
|
||||||
return fnmatchcase(value, pattern)
|
|
||||||
|
|
|
||||||
|
|
@@ -14,39 +14,6 @@ class ServerConfig(BaseModel):
|
||||||
class AuthConfig(BaseModel):
|
class AuthConfig(BaseModel):
|
||||||
client_api_keys: list[str] = Field(default_factory=list)
|
client_api_keys: list[str] = Field(default_factory=list)
|
||||||
node_api_keys: list[str] = Field(default_factory=list)
|
node_api_keys: list[str] = Field(default_factory=list)
|
||||||
enable_named_client_keys: bool = False
|
|
||||||
key_hash_secret_env: str = "GENIEHIVE_KEY_HASH_SECRET"
|
|
||||||
|
|
||||||
|
|
||||||
class AuditConfig(BaseModel):
|
|
||||||
enabled: bool = False
|
|
||||||
|
|
||||||
|
|
||||||
class AdminApiConfig(BaseModel):
|
|
||||||
enabled: bool = False
|
|
||||||
|
|
||||||
|
|
||||||
class AuthorizationConfig(BaseModel):
|
|
||||||
enforce_model_allowlists: bool = False
|
|
||||||
enforce_operation_allowlists: bool = False
|
|
||||||
empty_allowlist_means_no_access: bool = True
|
|
||||||
|
|
||||||
|
|
||||||
class ProviderConfig(BaseModel):
|
|
||||||
provider_id: str
|
|
||||||
provider_kind: str
|
|
||||||
base_url: str
|
|
||||||
api_key_env: str | None = None
|
|
||||||
default_headers: dict[str, str] = Field(default_factory=dict)
|
|
||||||
enabled: bool = True
|
|
||||||
|
|
||||||
|
|
||||||
class BudgetingConfig(BaseModel):
|
|
||||||
enabled: bool = False
|
|
||||||
reset_day_of_month: int = 1
|
|
||||||
global_monthly_budget_cents: int | None = None
|
|
||||||
provider_monthly_budget_cents: dict[str, int] = Field(default_factory=dict)
|
|
||||||
deny_on_unknown_cost: bool = False
|
|
||||||
|
|
||||||
|
|
||||||
class StorageConfig(BaseModel):
|
class StorageConfig(BaseModel):
|
||||||
|
|
@@ -66,14 +33,8 @@ class RoutingConfig(BaseModel):
|
||||||
|
|
||||||
|
|
||||||
class ControlConfig(BaseModel):
|
class ControlConfig(BaseModel):
|
||||||
deployment_profile: str = "casual"
|
|
||||||
server: ServerConfig = Field(default_factory=ServerConfig)
|
server: ServerConfig = Field(default_factory=ServerConfig)
|
||||||
auth: AuthConfig = Field(default_factory=AuthConfig)
|
auth: AuthConfig = Field(default_factory=AuthConfig)
|
||||||
audit: AuditConfig = Field(default_factory=AuditConfig)
|
|
||||||
admin_api: AdminApiConfig = Field(default_factory=AdminApiConfig)
|
|
||||||
authorization: AuthorizationConfig = Field(default_factory=AuthorizationConfig)
|
|
||||||
providers: list[ProviderConfig] = Field(default_factory=list)
|
|
||||||
budgeting: BudgetingConfig = Field(default_factory=BudgetingConfig)
|
|
||||||
storage: StorageConfig = Field(default_factory=StorageConfig)
|
storage: StorageConfig = Field(default_factory=StorageConfig)
|
||||||
routing: RoutingConfig = Field(default_factory=RoutingConfig)
|
routing: RoutingConfig = Field(default_factory=RoutingConfig)
|
||||||
roles_path: str | None = None
|
roles_path: str | None = None
|
||||||
|
|
|
||||||
|
|
@@ -1,39 +0,0 @@
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
import hashlib
|
|
||||||
import hmac
|
|
||||||
import secrets
|
|
||||||
|
|
||||||
|
|
||||||
DEFAULT_KEY_PREFIX = "gh"
|
|
||||||
|
|
||||||
|
|
||||||
def generate_api_key(*, prefix: str = DEFAULT_KEY_PREFIX, token_bytes: int = 32) -> str:
|
|
||||||
"""Generate a URL-safe API key. The raw value is only shown once."""
|
|
||||||
token = secrets.token_urlsafe(token_bytes)
|
|
||||||
return f"{prefix}_{token}"
|
|
||||||
|
|
||||||
|
|
||||||
def hash_api_key(api_key: str, *, secret: str) -> str:
|
|
||||||
if not secret:
|
|
||||||
raise ValueError("key hash secret must not be empty")
|
|
||||||
digest = hmac.new(
|
|
||||||
secret.encode("utf-8"),
|
|
||||||
api_key.encode("utf-8"),
|
|
||||||
hashlib.sha256,
|
|
||||||
).hexdigest()
|
|
||||||
return f"hmac-sha256:{digest}"
|
|
||||||
|
|
||||||
|
|
||||||
def verify_api_key(api_key: str, key_hash: str, *, secret: str) -> bool:
|
|
||||||
try:
|
|
||||||
expected = hash_api_key(api_key, secret=secret)
|
|
||||||
except ValueError:
|
|
||||||
return False
|
|
||||||
return hmac.compare_digest(expected, key_hash)
|
|
||||||
|
|
||||||
|
|
||||||
def redact_api_key(api_key: str) -> str:
|
|
||||||
if len(api_key) <= 12:
|
|
||||||
return "***"
|
|
||||||
return f"{api_key[:6]}...{api_key[-4:]}"
|
|
||||||
|
|
@@ -1,25 +1,20 @@
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import asyncio
|
import asyncio
|
||||||
import json
|
|
||||||
import os
|
import os
|
||||||
import time
|
|
||||||
import uuid
|
|
||||||
from contextlib import asynccontextmanager, suppress
|
from contextlib import asynccontextmanager, suppress
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
from fastapi import Depends, FastAPI, File, Form, HTTPException, Request, UploadFile, status
|
from fastapi import Depends, FastAPI, File, Form, Request, UploadFile
|
||||||
from fastapi.responses import JSONResponse, StreamingResponse
|
from fastapi.responses import JSONResponse, StreamingResponse
|
||||||
|
|
||||||
from .auth import authorize_client_request, require_admin_auth, require_client_auth, require_node_auth
|
from .auth import require_client_auth, require_node_auth
|
||||||
from .chat import ProxyError, _prepare_chat_upstream, proxy_chat_completion, proxy_embeddings, proxy_transcription, stream_chat_completion
|
from .chat import ProxyError, _prepare_chat_upstream, proxy_chat_completion, proxy_embeddings, proxy_transcription, stream_chat_completion
|
||||||
from .config import ControlConfig, load_config
|
from .config import ControlConfig, load_config
|
||||||
from .keys import generate_api_key, hash_api_key
|
|
||||||
from .models import BenchmarkIngestRequest, HostHeartbeat, HostRegistration, RouteMatchRequest, RouteMatchResponse
|
from .models import BenchmarkIngestRequest, HostHeartbeat, HostRegistration, RouteMatchRequest, RouteMatchResponse
|
||||||
from .probe import ServiceProber
|
from .probe import ServiceProber
|
||||||
from .roles import load_role_catalog
|
from .roles import load_role_catalog
|
||||||
from .registry import Registry
|
from .registry import Registry
|
||||||
from .routing import choose_upstream_model_id
|
|
||||||
from .upstream import UpstreamClient, UpstreamError
|
from .upstream import UpstreamClient, UpstreamError
|
||||||
|
|
||||||
|
|
||||||
|
|
@@ -66,184 +61,6 @@ def create_app(
|
||||||
async def health() -> dict[str, str]:
|
async def health() -> dict[str, str]:
|
||||||
return {"status": "ok"}
|
return {"status": "ok"}
|
||||||
|
|
||||||
def _public_client_key(row: dict) -> dict:
|
|
||||||
return {
|
|
||||||
key: value
|
|
||||||
for key, value in row.items()
|
|
||||||
if key != "key_hash"
|
|
||||||
}
|
|
||||||
|
|
||||||
def _request_id(request: Request) -> str:
|
|
||||||
return request.headers.get("X-Request-Id") or f"req_{uuid.uuid4().hex}"
|
|
||||||
|
|
||||||
def _client_context(request: Request):
|
|
||||||
return getattr(request.state, "client_context", None)
|
|
||||||
|
|
||||||
def _route_audit_metadata(reg: Registry, requested_model: str | None, *, kind: str) -> dict:
|
|
||||||
if not requested_model:
|
|
||||||
return {
|
|
||||||
"requested_model": None,
|
|
||||||
"resolved_service_id": None,
|
|
||||||
"resolved_host_id": None,
|
|
||||||
"upstream_model": None,
|
|
||||||
"provider_kind": None,
|
|
||||||
}
|
|
||||||
resolved = reg.resolve_route(requested_model, kind=kind)
|
|
||||||
service = resolved.get("service") if resolved else None
|
|
||||||
if not service:
|
|
||||||
return {
|
|
||||||
"requested_model": requested_model,
|
|
||||||
"resolved_service_id": None,
|
|
||||||
"resolved_host_id": None,
|
|
||||||
"upstream_model": None,
|
|
||||||
"provider_kind": None,
|
|
||||||
}
|
|
||||||
return {
|
|
||||||
"requested_model": requested_model,
|
|
||||||
"resolved_service_id": service.get("service_id"),
|
|
||||||
"resolved_host_id": service.get("host_id"),
|
|
||||||
"upstream_model": choose_upstream_model_id(requested_model, service),
|
|
||||||
"provider_kind": service.get("protocol"),
|
|
||||||
}
|
|
||||||
|
|
||||||
def _usage_from_response(response: object) -> dict[str, int | None]:
|
|
||||||
usage = response.get("usage", {}) if isinstance(response, dict) else {}
|
|
||||||
return {
|
|
||||||
"prompt_tokens": usage.get("prompt_tokens") if isinstance(usage, dict) else None,
|
|
||||||
"completion_tokens": usage.get("completion_tokens") if isinstance(usage, dict) else None,
|
|
||||||
"total_tokens": usage.get("total_tokens") if isinstance(usage, dict) else None,
|
|
||||||
}
|
|
||||||
|
|
||||||
def _audit_request(
|
|
||||||
request: Request,
|
|
||||||
*,
|
|
||||||
request_id: str,
|
|
||||||
operation: str,
|
|
||||||
route_metadata: dict,
|
|
||||||
started_at: float,
|
|
||||||
status_code: int,
|
|
||||||
success: bool,
|
|
||||||
response: object | None = None,
|
|
||||||
error_type: str | None = None,
|
|
||||||
input_bytes: int | None = None,
|
|
||||||
output_bytes: int | None = None,
|
|
||||||
) -> None:
|
|
||||||
if not cfg.audit.enabled:
|
|
||||||
return
|
|
||||||
context = _client_context(request)
|
|
||||||
usage = _usage_from_response(response)
|
|
||||||
request.app.state.registry.record_request_audit(
|
|
||||||
request_id=request_id,
|
|
||||||
key_id=getattr(context, "key_id", None),
|
|
||||||
principal_type=getattr(context, "principal_type", None),
|
|
||||||
principal_ref=getattr(context, "principal_ref", None),
|
|
||||||
operation=operation,
|
|
||||||
requested_model=route_metadata.get("requested_model"),
|
|
||||||
resolved_service_id=route_metadata.get("resolved_service_id"),
|
|
||||||
resolved_host_id=route_metadata.get("resolved_host_id"),
|
|
||||||
upstream_model=route_metadata.get("upstream_model"),
|
|
||||||
provider_kind=route_metadata.get("provider_kind"),
|
|
||||||
started_at=started_at,
|
|
||||||
finished_at=time.time(),
|
|
||||||
status_code=status_code,
|
|
||||||
success=success,
|
|
||||||
error_type=error_type,
|
|
||||||
input_bytes=input_bytes,
|
|
||||||
output_bytes=output_bytes,
|
|
||||||
**usage,
|
|
||||||
)
|
|
||||||
|
|
||||||
if cfg.admin_api.enabled:
|
|
||||||
@app.post("/v1/admin/client-keys")
|
|
||||||
async def create_client_key(request: Request, _=Depends(require_admin_auth)) -> dict:
|
|
||||||
if not cfg.auth.enable_named_client_keys:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_400_BAD_REQUEST,
|
|
||||||
detail="named client keys are not enabled",
|
|
||||||
)
|
|
||||||
secret = os.environ.get(cfg.auth.key_hash_secret_env)
|
|
||||||
if not secret:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
|
||||||
detail=f"{cfg.auth.key_hash_secret_env} is required for named client keys",
|
|
||||||
)
|
|
||||||
payload = await request.json()
|
|
||||||
raw_key = generate_api_key()
|
|
||||||
key_id = payload.get("key_id") or f"ck_{uuid.uuid4().hex}"
|
|
||||||
created = request.app.state.registry.create_client_key(
|
|
||||||
key_id=key_id,
|
|
||||||
key_hash=hash_api_key(raw_key, secret=secret),
|
|
||||||
display_name=payload["display_name"],
|
|
||||||
principal_type=payload["principal_type"],
|
|
||||||
principal_ref=payload["principal_ref"],
|
|
||||||
role=payload.get("role"),
|
|
||||||
allowed_models=payload.get("allowed_models") or [],
|
|
||||||
allowed_operations=payload.get("allowed_operations") or [],
|
|
||||||
monthly_budget_cents=payload.get("monthly_budget_cents"),
|
|
||||||
monthly_token_limit=payload.get("monthly_token_limit"),
|
|
||||||
enabled=payload.get("enabled", True),
|
|
||||||
notes=payload.get("notes"),
|
|
||||||
)
|
|
||||||
return {
|
|
||||||
"status": "ok",
|
|
||||||
"api_key": raw_key,
|
|
||||||
"client_key": _public_client_key(created),
|
|
||||||
}
|
|
||||||
|
|
||||||
@app.get("/v1/admin/client-keys")
|
|
||||||
async def list_client_keys(request: Request, _=Depends(require_admin_auth)) -> dict:
|
|
||||||
rows = request.app.state.registry.list_client_keys()
|
|
||||||
return {"object": "list", "data": [_public_client_key(row) for row in rows]}
|
|
||||||
|
|
||||||
@app.post("/v1/admin/client-keys/{key_id}/disable")
|
|
||||||
async def disable_client_key(key_id: str, request: Request, _=Depends(require_admin_auth)) -> dict:
|
|
||||||
updated = request.app.state.registry.set_client_key_enabled(key_id, False)
|
|
||||||
if updated is None:
|
|
||||||
return JSONResponse(status_code=404, content={"error": "unknown_client_key", "key_id": key_id})
|
|
||||||
return {"status": "ok", "client_key": _public_client_key(updated)}
|
|
||||||
|
|
||||||
@app.post("/v1/admin/client-keys/{key_id}/enable")
|
|
||||||
async def enable_client_key(key_id: str, request: Request, _=Depends(require_admin_auth)) -> dict:
|
|
||||||
updated = request.app.state.registry.set_client_key_enabled(key_id, True)
|
|
||||||
if updated is None:
|
|
||||||
return JSONResponse(status_code=404, content={"error": "unknown_client_key", "key_id": key_id})
|
|
||||||
return {"status": "ok", "client_key": _public_client_key(updated)}
|
|
||||||
|
|
||||||
@app.get("/v1/admin/audit/requests")
|
|
||||||
async def list_audit_requests(
|
|
||||||
request: Request,
|
|
||||||
key_id: str | None = None,
|
|
||||||
principal_ref: str | None = None,
|
|
||||||
operation: str | None = None,
|
|
||||||
model: str | None = None,
|
|
||||||
success: bool | None = None,
|
|
||||||
limit: int = 100,
|
|
||||||
_=Depends(require_admin_auth),
|
|
||||||
) -> dict:
|
|
||||||
if not cfg.audit.enabled:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_400_BAD_REQUEST,
|
|
||||||
detail="audit logging is not enabled",
|
|
||||||
)
|
|
||||||
rows = request.app.state.registry.list_request_audit(
|
|
||||||
key_id=key_id,
|
|
||||||
principal_ref=principal_ref,
|
|
||||||
operation=operation,
|
|
||||||
model=model,
|
|
||||||
success=success,
|
|
||||||
limit=limit,
|
|
||||||
)
|
|
||||||
return {"object": "list", "data": rows}
|
|
||||||
|
|
||||||
@app.get("/v1/admin/audit/summary")
|
|
||||||
async def audit_summary(request: Request, _=Depends(require_admin_auth)) -> dict:
|
|
||||||
if not cfg.audit.enabled:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_400_BAD_REQUEST,
|
|
||||||
detail="audit logging is not enabled",
|
|
||||||
)
|
|
||||||
return {"object": "list", "data": request.app.state.registry.request_audit_summary()}
|
|
||||||
|
|
||||||
@app.post("/v1/nodes/register")
|
@app.post("/v1/nodes/register")
|
||||||
async def register_node(request: Request, _=Depends(require_node_auth)) -> dict:
|
async def register_node(request: Request, _=Depends(require_node_auth)) -> dict:
|
||||||
payload = await request.json()
|
payload = await request.json()
|
||||||
|
|
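The helpers removed in the hunk above do three small jobs: strip `key_hash` before a client-key row leaves the admin API, reuse or mint a request ID, and pull token counts out of an upstream response. A standalone sketch of that logic follows. The names drop the leading underscore and `request_id_from` takes a plain dict where the original reads `request.headers`; both changes are assumptions made so the snippet runs without FastAPI.

```python
import uuid
from typing import Dict, Optional


def public_client_key(row: dict) -> dict:
    # Never return the stored hash to API callers.
    return {key: value for key, value in row.items() if key != "key_hash"}


def request_id_from(headers: Dict[str, str]) -> str:
    # Honour a caller-supplied ID, otherwise generate one.
    return headers.get("X-Request-Id") or f"req_{uuid.uuid4().hex}"


def usage_from_response(response: object) -> Dict[str, Optional[int]]:
    # Tolerate non-dict responses and a missing or null "usage" field.
    usage = response.get("usage", {}) if isinstance(response, dict) else {}
    fields = ("prompt_tokens", "completion_tokens", "total_tokens")
    return {name: usage.get(name) if isinstance(usage, dict) else None for name in fields}


row = {"key_id": "ck_demo", "key_hash": "hmac-sha256:demo", "enabled": True}
print(public_client_key(row))
print(usage_from_response({"usage": {"prompt_tokens": 3, "completion_tokens": 5, "total_tokens": 8}}))
```

Returning `None` rather than `0` for missing counts keeps "not reported" distinguishable from "zero tokens" in the audit table, which matches the nullable token columns in the schema.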
@@ -273,178 +90,45 @@ def create_app(
|
||||||
body = await request.json()
|
body = await request.json()
|
||||||
reg: Registry = request.app.state.registry
|
reg: Registry = request.app.state.registry
|
||||||
up: UpstreamClient = request.app.state.upstream
|
up: UpstreamClient = request.app.state.upstream
|
||||||
request_id = _request_id(request)
|
|
||||||
started_at = time.time()
|
|
||||||
route_metadata = _route_audit_metadata(reg, body.get("model"), kind="chat")
|
|
||||||
input_bytes = len(json.dumps(body, separators=(",", ":")).encode("utf-8"))
|
|
||||||
try:
|
try:
|
||||||
authorize_client_request(request, operation="chat", model=body.get("model"))
|
|
||||||
if body.get("stream"):
|
if body.get("stream"):
|
||||||
# Resolve route eagerly so ProxyError is raised before streaming starts.
|
# Resolve route eagerly so ProxyError is raised before streaming starts.
|
||||||
service, upstream_body = _prepare_chat_upstream(body, registry=reg)
|
service, upstream_body = _prepare_chat_upstream(body, registry=reg)
|
||||||
_audit_request(
|
|
||||||
request,
|
|
||||||
request_id=request_id,
|
|
||||||
operation="chat",
|
|
||||||
route_metadata=route_metadata,
|
|
||||||
started_at=started_at,
|
|
||||||
status_code=200,
|
|
||||||
success=True,
|
|
||||||
input_bytes=input_bytes,
|
|
||||||
)
|
|
||||||
return StreamingResponse(
|
return StreamingResponse(
|
||||||
stream_chat_completion(service, upstream_body, upstream=up),
|
stream_chat_completion(service, upstream_body, upstream=up),
|
||||||
media_type="text/event-stream",
|
media_type="text/event-stream",
|
||||||
headers={"Cache-Control": "no-cache", "X-Accel-Buffering": "no", "X-Request-Id": request_id},
|
headers={"Cache-Control": "no-cache", "X-Accel-Buffering": "no"},
|
||||||
)
|
)
|
||||||
response = await proxy_chat_completion(body, registry=reg, upstream=up)
|
return await proxy_chat_completion(body, registry=reg, upstream=up)
|
||||||
output_bytes = len(json.dumps(response, separators=(",", ":")).encode("utf-8")) if isinstance(response, dict) else None
|
|
||||||
_audit_request(
|
|
||||||
request,
|
|
||||||
request_id=request_id,
|
|
||||||
operation="chat",
|
|
||||||
route_metadata=route_metadata,
|
|
||||||
started_at=started_at,
|
|
||||||
status_code=200,
|
|
||||||
success=True,
|
|
||||||
response=response,
|
|
||||||
input_bytes=input_bytes,
|
|
||||||
output_bytes=output_bytes,
|
|
||||||
)
|
|
||||||
return JSONResponse(content=response, headers={"X-Request-Id": request_id})
|
|
||||||
except ProxyError as exc:
|
except ProxyError as exc:
|
||||||
_audit_request(
|
|
||||||
request,
|
|
||||||
request_id=request_id,
|
|
||||||
operation="chat",
|
|
||||||
route_metadata=route_metadata,
|
|
||||||
started_at=started_at,
|
|
||||||
status_code=exc.status_code,
|
|
||||||
success=False,
|
|
||||||
error_type="proxy_error",
|
|
||||||
input_bytes=input_bytes,
|
|
||||||
)
|
|
||||||
return JSONResponse(
|
return JSONResponse(
|
||||||
status_code=exc.status_code,
|
status_code=exc.status_code,
|
||||||
content={"error": {"message": str(exc), "type": "geniehive_error", "code": "chat_proxy_error"}},
|
content={"error": {"message": str(exc), "type": "geniehive_error", "code": "chat_proxy_error"}},
|
||||||
headers={"X-Request-Id": request_id},
|
|
||||||
)
|
|
||||||
except HTTPException as exc:
|
|
||||||
_audit_request(
|
|
||||||
request,
|
|
||||||
request_id=request_id,
|
|
||||||
operation="chat",
|
|
||||||
route_metadata=route_metadata,
|
|
||||||
started_at=started_at,
|
|
||||||
status_code=exc.status_code,
|
|
||||||
success=False,
|
|
||||||
error_type="authorization_error",
|
|
||||||
input_bytes=input_bytes,
|
|
||||||
)
|
|
||||||
return JSONResponse(
|
|
||||||
status_code=exc.status_code,
|
|
||||||
content={"error": {"message": str(exc.detail), "type": "geniehive_error", "code": "authorization_error"}},
|
|
||||||
headers={"X-Request-Id": request_id},
|
|
||||||
)
|
)
|
||||||
except UpstreamError as exc:
|
except UpstreamError as exc:
|
||||||
status_code = exc.status_code or 502
|
|
||||||
_audit_request(
|
|
||||||
request,
|
|
||||||
request_id=request_id,
|
|
||||||
operation="chat",
|
|
||||||
route_metadata=route_metadata,
|
|
||||||
started_at=started_at,
|
|
||||||
status_code=status_code,
|
|
||||||
success=False,
|
|
||||||
error_type="upstream_error",
|
|
||||||
input_bytes=input_bytes,
|
|
||||||
)
|
|
||||||
return JSONResponse(
|
return JSONResponse(
|
||||||
status_code=status_code,
|
status_code=exc.status_code or 502,
|
||||||
content={"error": {"message": str(exc), "type": "geniehive_error", "code": "upstream_error"}},
|
content={"error": {"message": str(exc), "type": "geniehive_error", "code": "upstream_error"}},
|
||||||
headers={"X-Request-Id": request_id},
|
|
||||||
)
|
)
|
||||||
|
|
||||||
@app.post("/v1/embeddings")
|
@app.post("/v1/embeddings")
|
||||||
async def embeddings(request: Request, _=Depends(require_client_auth)):
|
async def embeddings(request: Request, _=Depends(require_client_auth)):
|
||||||
body = await request.json()
|
body = await request.json()
|
||||||
reg: Registry = request.app.state.registry
|
|
||||||
request_id = _request_id(request)
|
|
||||||
started_at = time.time()
|
|
||||||
route_metadata = _route_audit_metadata(reg, body.get("model"), kind="embeddings")
|
|
||||||
input_bytes = len(json.dumps(body, separators=(",", ":")).encode("utf-8"))
|
|
||||||
try:
|
try:
|
||||||
authorize_client_request(request, operation="embeddings", model=body.get("model"))
|
return await proxy_embeddings(
|
||||||
response = await proxy_embeddings(
|
|
||||||
body,
|
body,
|
||||||
registry=reg,
|
registry=request.app.state.registry,
|
||||||
upstream=request.app.state.upstream,
|
upstream=request.app.state.upstream,
|
||||||
)
|
)
|
||||||
output_bytes = len(json.dumps(response, separators=(",", ":")).encode("utf-8")) if isinstance(response, dict) else None
|
|
||||||
_audit_request(
|
|
||||||
request,
|
|
||||||
request_id=request_id,
|
|
||||||
operation="embeddings",
|
|
||||||
route_metadata=route_metadata,
|
|
||||||
started_at=started_at,
|
|
||||||
status_code=200,
|
|
||||||
success=True,
|
|
||||||
response=response,
|
|
||||||
input_bytes=input_bytes,
|
|
||||||
output_bytes=output_bytes,
|
|
||||||
)
|
|
||||||
return JSONResponse(content=response, headers={"X-Request-Id": request_id})
|
|
||||||
except ProxyError as exc:
|
except ProxyError as exc:
|
||||||
_audit_request(
|
|
||||||
request,
|
|
||||||
request_id=request_id,
|
|
||||||
operation="embeddings",
|
|
||||||
route_metadata=route_metadata,
|
|
||||||
started_at=started_at,
|
|
||||||
status_code=exc.status_code,
|
|
||||||
success=False,
|
|
||||||
error_type="proxy_error",
|
|
||||||
input_bytes=input_bytes,
|
|
||||||
)
|
|
||||||
return JSONResponse(
|
return JSONResponse(
|
||||||
status_code=exc.status_code,
|
status_code=exc.status_code,
|
||||||
content={"error": {"message": str(exc), "type": "geniehive_error", "code": "embeddings_proxy_error"}},
|
content={"error": {"message": str(exc), "type": "geniehive_error", "code": "embeddings_proxy_error"}},
|
||||||
headers={"X-Request-Id": request_id},
|
|
||||||
)
|
|
||||||
except HTTPException as exc:
|
|
||||||
_audit_request(
|
|
||||||
request,
|
|
||||||
request_id=request_id,
|
|
||||||
operation="embeddings",
|
|
||||||
route_metadata=route_metadata,
|
|
||||||
started_at=started_at,
|
|
||||||
status_code=exc.status_code,
|
|
||||||
success=False,
|
|
||||||
error_type="authorization_error",
|
|
||||||
input_bytes=input_bytes,
|
|
||||||
)
|
|
||||||
return JSONResponse(
|
|
||||||
status_code=exc.status_code,
|
|
||||||
content={"error": {"message": str(exc.detail), "type": "geniehive_error", "code": "authorization_error"}},
|
|
||||||
headers={"X-Request-Id": request_id},
|
|
||||||
)
|
)
|
||||||
except UpstreamError as exc:
|
except UpstreamError as exc:
|
||||||
status_code = exc.status_code or 502
|
|
||||||
_audit_request(
|
|
||||||
request,
|
|
||||||
request_id=request_id,
|
|
||||||
operation="embeddings",
|
|
||||||
route_metadata=route_metadata,
|
|
||||||
started_at=started_at,
|
|
||||||
status_code=status_code,
|
|
||||||
success=False,
|
|
||||||
error_type="upstream_error",
|
|
||||||
input_bytes=input_bytes,
|
|
||||||
)
|
|
||||||
return JSONResponse(
|
return JSONResponse(
|
||||||
status_code=status_code,
|
status_code=exc.status_code or 502,
|
||||||
content={"error": {"message": str(exc), "type": "geniehive_error", "code": "upstream_error"}},
|
content={"error": {"message": str(exc), "type": "geniehive_error", "code": "upstream_error"}},
|
||||||
headers={"X-Request-Id": request_id},
|
|
||||||
)
|
)
|
||||||
|
|
||||||
@app.post("/v1/audio/transcriptions")
|
@app.post("/v1/audio/transcriptions")
|
||||||
|
|
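The chat and embeddings handlers above repeat two small pieces of logic: measuring a payload as compact JSON for `input_bytes` / `output_bytes`, and wrapping failures in a `geniehive_error` envelope. A minimal sketch of both follows. The helper names `json_size_bytes` and `error_envelope` are hypothetical and do not appear in the diff, and the `None` return for non-dict payloads mirrors only the `output_bytes` computation.

```python
import json
from typing import Optional


def json_size_bytes(payload: object) -> Optional[int]:
    # Mirrors the diff's output_bytes computation: only dict payloads are measured.
    if not isinstance(payload, dict):
        return None
    # Compact separators drop the spaces json.dumps inserts by default.
    return len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))


def error_envelope(message: str, code: str) -> dict:
    # Same shape as the JSON error bodies returned by the handlers above.
    return {"error": {"message": message, "type": "geniehive_error", "code": code}}


print(json_size_bytes({"a": 1}))
print(error_envelope("no route for model", "chat_proxy_error"))
```

Note that `json.dumps` escapes non-ASCII characters by default, so the recorded size is that of the escaped form and may differ from the byte count actually sent on the wire.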
@@ -458,12 +142,8 @@ def create_app(
|
||||||
temperature: float | None = Form(None),
|
temperature: float | None = Form(None),
|
||||||
_=Depends(require_client_auth),
|
_=Depends(require_client_auth),
|
||||||
):
|
):
|
||||||
request_id = _request_id(request)
|
|
||||||
started_at = time.time()
|
|
||||||
route_metadata = _route_audit_metadata(request.app.state.registry, model, kind="transcription")
|
|
||||||
try:
|
try:
|
||||||
authorize_client_request(request, operation="transcription", model=model)
|
return await proxy_transcription(
|
||||||
response = await proxy_transcription(
|
|
||||||
model=model,
|
model=model,
|
||||||
file=file,
|
file=file,
|
||||||
language=language,
|
language=language,
|
||||||
|
|
@@ -473,67 +153,15 @@ def create_app(
|
||||||
registry=request.app.state.registry,
|
registry=request.app.state.registry,
|
||||||
upstream=request.app.state.upstream,
|
upstream=request.app.state.upstream,
|
||||||
)
|
)
|
||||||
output_bytes = len(json.dumps(response, separators=(",", ":")).encode("utf-8")) if isinstance(response, dict) else None
|
|
||||||
_audit_request(
|
|
||||||
request,
|
|
||||||
request_id=request_id,
|
|
||||||
operation="transcription",
|
|
||||||
route_metadata=route_metadata,
|
|
||||||
started_at=started_at,
|
|
||||||
status_code=200,
|
|
||||||
success=True,
|
|
||||||
response=response,
|
|
||||||
output_bytes=output_bytes,
|
|
||||||
)
|
|
||||||
return JSONResponse(content=response, headers={"X-Request-Id": request_id})
|
|
||||||
except ProxyError as exc:
|
except ProxyError as exc:
|
||||||
_audit_request(
|
|
||||||
request,
|
|
||||||
request_id=request_id,
|
|
||||||
operation="transcription",
|
|
||||||
route_metadata=route_metadata,
|
|
||||||
started_at=started_at,
|
|
||||||
status_code=exc.status_code,
|
|
||||||
success=False,
|
|
||||||
error_type="proxy_error",
|
|
||||||
)
|
|
||||||
return JSONResponse(
|
return JSONResponse(
|
||||||
status_code=exc.status_code,
|
status_code=exc.status_code,
|
||||||
content={"error": {"message": str(exc), "type": "geniehive_error", "code": "transcription_proxy_error"}},
|
content={"error": {"message": str(exc), "type": "geniehive_error", "code": "transcription_proxy_error"}},
|
||||||
headers={"X-Request-Id": request_id},
|
|
||||||
)
|
|
||||||
except HTTPException as exc:
|
|
||||||
_audit_request(
|
|
||||||
request,
|
|
||||||
request_id=request_id,
|
|
||||||
operation="transcription",
|
|
||||||
route_metadata=route_metadata,
|
|
||||||
started_at=started_at,
|
|
||||||
status_code=exc.status_code,
|
|
||||||
success=False,
|
|
||||||
error_type="authorization_error",
|
|
||||||
)
|
|
||||||
return JSONResponse(
|
|
||||||
status_code=exc.status_code,
|
|
||||||
content={"error": {"message": str(exc.detail), "type": "geniehive_error", "code": "authorization_error"}},
|
|
||||||
headers={"X-Request-Id": request_id},
|
|
||||||
)
|
)
|
||||||
except UpstreamError as exc:
|
except UpstreamError as exc:
|
||||||
status_code = exc.status_code or 502
|
|
||||||
_audit_request(
|
|
||||||
request,
|
|
||||||
request_id=request_id,
|
|
||||||
operation="transcription",
|
|
||||||
route_metadata=route_metadata,
|
|
||||||
started_at=started_at,
|
|
||||||
status_code=status_code,
|
|
||||||
success=False,
|
|
||||||
error_type="upstream_error",
|
|
||||||
)
|
|
||||||
return JSONResponse(
|
return JSONResponse(
|
||||||
status_code=status_code,
|
status_code=exc.status_code or 502,
|
||||||
content={"error": {"message": str(exc), "type": "geniehive_error", "code": "upstream_error"}},
|
content={"error": {"message": str(exc), "type": "geniehive_error", "code": "upstream_error"}},
|
||||||
headers={"X-Request-Id": request_id},
|
|
||||||
)
|
)
|
||||||
|
|
||||||
@app.get("/v1/cluster/services")
|
@app.get("/v1/cluster/services")
|
||||||
|
|
|
||||||
|
|
@@ -77,50 +77,6 @@ class Registry:
|
||||||
observed_at REAL NOT NULL,
|
observed_at REAL NOT NULL,
|
||||||
results_json TEXT NOT NULL
|
results_json TEXT NOT NULL
|
||||||
);
|
);
|
||||||
|
|
||||||
CREATE TABLE IF NOT EXISTS client_keys (
|
|
||||||
key_id TEXT PRIMARY KEY,
|
|
||||||
key_hash TEXT NOT NULL UNIQUE,
|
|
||||||
display_name TEXT NOT NULL,
|
|
||||||
principal_type TEXT NOT NULL,
|
|
||||||
principal_ref TEXT NOT NULL,
|
|
||||||
role TEXT,
|
|
||||||
allowed_models_json TEXT NOT NULL DEFAULT '[]',
|
|
||||||
allowed_operations_json TEXT NOT NULL DEFAULT '[]',
|
|
||||||
monthly_budget_cents INTEGER,
|
|
||||||
monthly_token_limit INTEGER,
|
|
||||||
enabled INTEGER NOT NULL DEFAULT 1,
|
|
||||||
created_at REAL NOT NULL,
|
|
||||||
updated_at REAL NOT NULL,
|
|
||||||
last_used_at REAL,
|
|
||||||
notes TEXT
|
|
||||||
);
|
|
||||||
|
|
||||||
CREATE TABLE IF NOT EXISTS request_audit_log (
|
|
||||||
request_id TEXT PRIMARY KEY,
|
|
||||||
key_id TEXT,
|
|
||||||
principal_type TEXT,
|
|
||||||
principal_ref TEXT,
|
|
||||||
operation TEXT NOT NULL,
|
|
||||||
requested_model TEXT,
|
|
||||||
resolved_service_id TEXT,
|
|
||||||
resolved_host_id TEXT,
|
|
||||||
upstream_model TEXT,
|
|
||||||
provider_kind TEXT,
|
|
||||||
started_at REAL NOT NULL,
|
|
||||||
finished_at REAL NOT NULL,
|
|
||||||
duration_ms REAL NOT NULL,
|
|
||||||
status_code INTEGER NOT NULL,
|
|
||||||
success INTEGER NOT NULL,
|
|
||||||
error_type TEXT,
|
|
||||||
prompt_tokens INTEGER,
|
|
||||||
completion_tokens INTEGER,
|
|
||||||
total_tokens INTEGER,
|
|
||||||
estimated_cost_cents REAL,
|
|
||||||
input_bytes INTEGER,
|
|
||||||
output_bytes INTEGER,
|
|
||||||
metadata_json TEXT NOT NULL DEFAULT '{}'
|
|
||||||
);
|
|
||||||
"""
|
"""
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
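To see how the removed `client_keys` table behaves, the sketch below loads the same DDL into an in-memory SQLite database, inserts one row, and reads it back. The column definitions are taken from the hunk above; the sample values (`ck_demo`, `demo@example.org`, `model-a`) are made up, and the in-memory connection is an assumption standing in for the registry's own connection handling.

```python
import json
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS client_keys (
    key_id TEXT PRIMARY KEY,
    key_hash TEXT NOT NULL UNIQUE,
    display_name TEXT NOT NULL,
    principal_type TEXT NOT NULL,
    principal_ref TEXT NOT NULL,
    role TEXT,
    allowed_models_json TEXT NOT NULL DEFAULT '[]',
    allowed_operations_json TEXT NOT NULL DEFAULT '[]',
    monthly_budget_cents INTEGER,
    monthly_token_limit INTEGER,
    enabled INTEGER NOT NULL DEFAULT 1,
    created_at REAL NOT NULL,
    updated_at REAL NOT NULL,
    last_used_at REAL,
    notes TEXT
);
"""

INSERT = (
    "INSERT INTO client_keys (key_id, key_hash, display_name, principal_type, "
    "principal_ref, allowed_models_json, created_at, updated_at) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?)"
)

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows row["column"] access
conn.executescript(SCHEMA)

now = time.time()
conn.execute(INSERT, ("ck_demo", "hmac-sha256:demo", "Demo key", "user",
                      "demo@example.org", json.dumps(["model-a"]), now, now))

row = conn.execute("SELECT * FROM client_keys WHERE key_id = ?", ("ck_demo",)).fetchone()
allowed_models = json.loads(row["allowed_models_json"])
allowed_operations = json.loads(row["allowed_operations_json"])  # column default '[]'
enabled = bool(row["enabled"])  # column default 1

# The UNIQUE constraint on key_hash rejects a second key with the same hash.
try:
    conn.execute(INSERT, ("ck_other", "hmac-sha256:demo", "Other", "user",
                          "other@example.org", "[]", now, now))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(allowed_models, allowed_operations, enabled, duplicate_rejected)
```

Storing the allowlists as JSON text keeps the schema flat, at the cost of not being able to filter on individual models or operations in SQL.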
@@ -334,227 +290,6 @@ class Registry:
|
||||||
rows = conn.execute(query, params).fetchall()
|
rows = conn.execute(query, params).fetchall()
|
||||||
return [self._benchmark_row_to_dict(row) for row in rows]
|
return [self._benchmark_row_to_dict(row) for row in rows]
|
||||||
|
|
||||||
def create_client_key(
|
|
||||||
self,
|
|
||||||
*,
|
|
||||||
key_id: str,
|
|
||||||
key_hash: str,
|
|
||||||
display_name: str,
|
|
||||||
principal_type: str,
|
|
||||||
principal_ref: str,
|
|
||||||
role: str | None = None,
|
|
||||||
allowed_models: list[str] | None = None,
|
|
||||||
allowed_operations: list[str] | None = None,
|
|
||||||
monthly_budget_cents: int | None = None,
|
|
||||||
monthly_token_limit: int | None = None,
|
|
||||||
enabled: bool = True,
|
|
||||||
notes: str | None = None,
|
|
||||||
) -> dict:
|
|
||||||
now = time.time()
|
|
||||||
with self._connect() as conn:
|
|
||||||
conn.execute(
|
|
||||||
"""
|
|
||||||
INSERT INTO client_keys (
|
|
||||||
key_id, key_hash, display_name, principal_type, principal_ref,
|
|
||||||
role, allowed_models_json, allowed_operations_json,
|
|
||||||
monthly_budget_cents, monthly_token_limit, enabled,
|
|
||||||
created_at, updated_at, last_used_at, notes
|
|
||||||
)
|
|
||||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, NULL, ?)
|
|
||||||
""",
|
|
||||||
(
|
|
||||||
key_id,
|
|
||||||
key_hash,
|
|
||||||
display_name,
|
|
||||||
principal_type,
|
|
||||||
principal_ref,
|
|
||||||
role,
|
|
||||||
_json_dumps(allowed_models or []),
|
|
||||||
_json_dumps(allowed_operations or []),
|
|
||||||
monthly_budget_cents,
|
|
||||||
monthly_token_limit,
|
|
||||||
1 if enabled else 0,
|
|
||||||
now,
|
|
||||||
now,
|
|
||||||
notes,
|
|
||||||
),
|
|
||||||
)
|
|
||||||
created = self.get_client_key(key_id)
|
|
||||||
if created is None:
|
|
||||||
raise RuntimeError(f"created client key {key_id!r} could not be loaded")
|
|
||||||
return created
|
|
||||||
|
|
||||||
def get_client_key(self, key_id: str) -> dict | None:
|
|
||||||
with self._connect() as conn:
|
|
||||||
row = conn.execute("SELECT * FROM client_keys WHERE key_id = ?", (key_id,)).fetchone()
|
|
||||||
return self._client_key_row_to_dict(row) if row is not None else None
|
|
||||||
|
|
||||||
def get_client_key_by_hash(self, key_hash: str) -> dict | None:
|
|
||||||
with self._connect() as conn:
|
|
||||||
row = conn.execute("SELECT * FROM client_keys WHERE key_hash = ?", (key_hash,)).fetchone()
|
|
||||||
return self._client_key_row_to_dict(row) if row is not None else None
|
|
||||||
|
|
||||||
def list_client_keys(self) -> list[dict]:
|
|
||||||
with self._connect() as conn:
|
|
||||||
rows = conn.execute("SELECT * FROM client_keys ORDER BY created_at, key_id").fetchall()
|
|
||||||
return [self._client_key_row_to_dict(row) for row in rows]
|
|
||||||
|
|
||||||
def set_client_key_enabled(self, key_id: str, enabled: bool) -> dict | None:
|
|
||||||
now = time.time()
|
|
||||||
with self._connect() as conn:
|
|
||||||
conn.execute(
|
|
||||||
"UPDATE client_keys SET enabled = ?, updated_at = ? WHERE key_id = ?",
|
|
||||||
(1 if enabled else 0, now, key_id),
|
|
||||||
)
|
|
||||||
return self.get_client_key(key_id)
|
|
||||||
|
|
||||||
def touch_client_key(self, key_id: str) -> None:
|
|
||||||
now = time.time()
|
|
||||||
with self._connect() as conn:
|
|
||||||
conn.execute(
|
|
||||||
"UPDATE client_keys SET last_used_at = ?, updated_at = ? WHERE key_id = ?",
|
|
||||||
(now, now, key_id),
|
|
||||||
)
|
|
||||||
|
|
||||||
def record_request_audit(
|
|
||||||
self,
|
|
||||||
*,
|
|
||||||
request_id: str,
|
|
||||||
key_id: str | None,
|
|
||||||
principal_type: str | None,
|
|
||||||
principal_ref: str | None,
|
|
||||||
operation: str,
|
|
||||||
requested_model: str | None,
|
|
||||||
resolved_service_id: str | None,
|
|
||||||
resolved_host_id: str | None,
|
|
||||||
upstream_model: str | None,
|
|
||||||
provider_kind: str | None,
|
|
||||||
started_at: float,
|
|
||||||
finished_at: float,
|
|
||||||
status_code: int,
|
|
||||||
success: bool,
|
|
||||||
error_type: str | None = None,
|
|
||||||
prompt_tokens: int | None = None,
|
|
||||||
completion_tokens: int | None = None,
|
|
||||||
total_tokens: int | None = None,
|
|
||||||
estimated_cost_cents: float | None = None,
|
|
||||||
input_bytes: int | None = None,
|
|
||||||
output_bytes: int | None = None,
|
|
||||||
metadata: dict | None = None,
|
|
||||||
) -> dict:
|
|
||||||
duration_ms = max(0.0, (finished_at - started_at) * 1000.0)
|
|
||||||
with self._connect() as conn:
|
|
||||||
conn.execute(
|
|
||||||
"""
|
|
||||||
INSERT INTO request_audit_log (
|
|
||||||
request_id, key_id, principal_type, principal_ref,
|
|
||||||
operation, requested_model, resolved_service_id,
|
|
||||||
resolved_host_id, upstream_model, provider_kind,
|
|
||||||
started_at, finished_at, duration_ms, status_code, success,
|
|
||||||
error_type, prompt_tokens, completion_tokens, total_tokens,
|
|
||||||
estimated_cost_cents, input_bytes, output_bytes,
|
|
||||||
metadata_json
|
|
||||||
)
|
|
||||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
|
|
||||||
""",
|
|
||||||
(
|
|
||||||
request_id,
|
|
||||||
key_id,
|
|
||||||
principal_type,
|
|
||||||
principal_ref,
|
|
||||||
operation,
|
|
||||||
requested_model,
|
|
||||||
resolved_service_id,
|
|
||||||
resolved_host_id,
|
|
||||||
upstream_model,
|
|
||||||
provider_kind,
|
|
||||||
started_at,
|
|
||||||
finished_at,
|
|
||||||
duration_ms,
|
|
||||||
status_code,
|
|
||||||
1 if success else 0,
|
|
||||||
error_type,
|
|
||||||
prompt_tokens,
|
|
||||||
                    completion_tokens,
                    total_tokens,
                    estimated_cost_cents,
                    input_bytes,
                    output_bytes,
                    _json_dumps(metadata or {}),
                ),
            )
        row = self.get_request_audit(request_id)
        if row is None:
            raise RuntimeError(f"created audit row {request_id!r} could not be loaded")
        return row

    def get_request_audit(self, request_id: str) -> dict | None:
        with self._connect() as conn:
            row = conn.execute(
                "SELECT * FROM request_audit_log WHERE request_id = ?",
                (request_id,),
            ).fetchone()
        return self._request_audit_row_to_dict(row) if row is not None else None

    def list_request_audit(
        self,
        *,
        key_id: str | None = None,
        principal_ref: str | None = None,
        operation: str | None = None,
        model: str | None = None,
        success: bool | None = None,
        limit: int = 100,
    ) -> list[dict]:
        query = "SELECT * FROM request_audit_log"
        clauses = []
        params: list[object] = []
        if key_id:
            clauses.append("key_id = ?")
            params.append(key_id)
        if principal_ref:
            clauses.append("principal_ref = ?")
            params.append(principal_ref)
        if operation:
            clauses.append("operation = ?")
            params.append(operation)
        if model:
            clauses.append("requested_model = ?")
            params.append(model)
        if success is not None:
            clauses.append("success = ?")
            params.append(1 if success else 0)
        if clauses:
            query += " WHERE " + " AND ".join(clauses)
        query += " ORDER BY started_at DESC LIMIT ?"
        params.append(max(1, min(limit, 1000)))
        with self._connect() as conn:
            rows = conn.execute(query, params).fetchall()
        return [self._request_audit_row_to_dict(row) for row in rows]

    def request_audit_summary(self) -> list[dict]:
        with self._connect() as conn:
            rows = conn.execute(
                """
                SELECT
                    key_id,
                    principal_ref,
                    operation,
                    requested_model,
                    COUNT(*) AS request_count,
                    SUM(success) AS success_count,
                    SUM(CASE WHEN success = 0 THEN 1 ELSE 0 END) AS failure_count,
                    SUM(COALESCE(prompt_tokens, 0)) AS prompt_tokens,
                    SUM(COALESCE(completion_tokens, 0)) AS completion_tokens,
                    SUM(COALESCE(total_tokens, 0)) AS total_tokens,
                    SUM(COALESCE(estimated_cost_cents, 0)) AS estimated_cost_cents
                FROM request_audit_log
                GROUP BY key_id, principal_ref, operation, requested_model
                ORDER BY request_count DESC, requested_model
                """
            ).fetchall()
        return [dict(row) for row in rows]

    def list_client_models(self) -> list[dict]:
        services = self.list_services()
        roles = self.list_roles()
@@ -1072,54 +807,6 @@ class Registry:
            "results": json.loads(row["results_json"]),
        }

    @staticmethod
    def _client_key_row_to_dict(row: sqlite3.Row) -> dict:
        return {
            "key_id": row["key_id"],
            "key_hash": row["key_hash"],
            "display_name": row["display_name"],
            "principal_type": row["principal_type"],
            "principal_ref": row["principal_ref"],
            "role": row["role"],
            "allowed_models": json.loads(row["allowed_models_json"]),
            "allowed_operations": json.loads(row["allowed_operations_json"]),
            "monthly_budget_cents": row["monthly_budget_cents"],
            "monthly_token_limit": row["monthly_token_limit"],
            "enabled": bool(row["enabled"]),
            "created_at": row["created_at"],
            "updated_at": row["updated_at"],
            "last_used_at": row["last_used_at"],
            "notes": row["notes"],
        }

    @staticmethod
    def _request_audit_row_to_dict(row: sqlite3.Row) -> dict:
        return {
            "request_id": row["request_id"],
            "key_id": row["key_id"],
            "principal_type": row["principal_type"],
            "principal_ref": row["principal_ref"],
            "operation": row["operation"],
            "requested_model": row["requested_model"],
            "resolved_service_id": row["resolved_service_id"],
            "resolved_host_id": row["resolved_host_id"],
            "upstream_model": row["upstream_model"],
            "provider_kind": row["provider_kind"],
            "started_at": row["started_at"],
            "finished_at": row["finished_at"],
            "duration_ms": row["duration_ms"],
            "status_code": row["status_code"],
            "success": bool(row["success"]),
            "error_type": row["error_type"],
            "prompt_tokens": row["prompt_tokens"],
            "completion_tokens": row["completion_tokens"],
            "total_tokens": row["total_tokens"],
            "estimated_cost_cents": row["estimated_cost_cents"],
            "input_bytes": row["input_bytes"],
            "output_bytes": row["output_bytes"],
            "metadata": json.loads(row["metadata_json"]),
        }


def _tokenize_text(value: str) -> set[str]:
    return {token for token in re.split(r"[^a-z0-9]+", value.lower()) if token}
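Reviewer note: the removed `list_request_audit` builds its `WHERE` clause from whichever optional filters are set, binds every value as a parameter, and clamps `limit` to the range 1..1000. The sketch below reproduces that pattern against an in-memory SQLite table so it can be run in isolation. The `list_rows` helper, the reduced column set, and the sample rows are assumptions made for illustration; they are not the project's actual schema or API.

```python
import sqlite3


def list_rows(conn, *, operation=None, success=None, limit=100):
    """Hypothetical stand-in for the filter logic in list_request_audit."""
    query = "SELECT request_id, operation, success FROM request_audit_log"
    clauses = []
    params = []
    if operation:
        clauses.append("operation = ?")
        params.append(operation)
    if success is not None:
        # SQLite has no boolean type, so the flag is stored as 0/1.
        clauses.append("success = ?")
        params.append(1 if success else 0)
    if clauses:
        query += " WHERE " + " AND ".join(clauses)
    query += " ORDER BY started_at DESC LIMIT ?"
    # Clamp the caller-supplied limit to 1..1000, as the removed method does.
    params.append(max(1, min(limit, 1000)))
    return conn.execute(query, params).fetchall()


# Illustrative table with only the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE request_audit_log ("
    "request_id TEXT PRIMARY KEY, operation TEXT, success INTEGER, started_at REAL)"
)
conn.executemany(
    "INSERT INTO request_audit_log VALUES (?, ?, ?, ?)",
    [
        ("r1", "chat", 1, 1.0),
        ("r2", "chat", 0, 2.0),
        ("r3", "embeddings", 1, 3.0),
    ],
)

chat_ids = [row[0] for row in list_rows(conn, operation="chat")]
ok_ids = [row[0] for row in list_rows(conn, success=True)]
clamped = list_rows(conn, limit=0)
```

Only fixed column names are concatenated into the SQL text; user-supplied values always travel through `?` placeholders, which is what keeps this style of query building safe from injection.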
@@ -1,154 +0,0 @@
import json
from pathlib import Path

from fastapi.testclient import TestClient

from geniehive_control.main import create_app
from geniehive_control.models import HostRegistration, RegisteredService
from geniehive_control.upstream import UpstreamClient


class _FakeResponse:
    def __init__(self, payload: dict, status_code: int = 200) -> None:
        self._payload = payload
        self.status_code = status_code
        self.text = str(payload)

    def json(self) -> dict:
        return self._payload


class _UsagePoster:
    async def post(self, url: str, *, json: dict, headers: dict[str, str] | None = None) -> _FakeResponse:
        return _FakeResponse(
            {
                "object": "chat.completion",
                "model": json["model"],
                "choices": [{"index": 0, "message": {"role": "assistant", "content": "done"}}],
                "usage": {
                    "prompt_tokens": 7,
                    "completion_tokens": 3,
                    "total_tokens": 10,
                },
            }
        )


def _write_audit_config(tmp_path: Path) -> Path:
    config_path = tmp_path / "control.yaml"
    config_path.write_text(
        f"""
auth:
  client_api_keys:
    - audit-key
audit:
  enabled: true
admin_api:
  enabled: true
storage:
  sqlite_path: "{tmp_path / 'geniehive.sqlite3'}"
"""
    )
    return config_path


def _register_chat_service(app) -> None:
    app.state.registry.register_host(
        HostRegistration(
            host_id="atlas-01",
            address="127.0.0.1",
            services=[
                RegisteredService(
                    service_id="atlas-01/chat/qwen",
                    host_id="atlas-01",
                    kind="chat",
                    protocol="openai",
                    endpoint="http://127.0.0.1:18091",
                    assets=[{"asset_id": "qwen-test", "loaded": True}],
                    state={"health": "healthy", "accept_requests": True},
                    observed={"p50_latency_ms": 100},
                )
            ],
        )
    )


def test_successful_chat_request_is_audited_without_prompt_content(tmp_path: Path) -> None:
    app = create_app(_write_audit_config(tmp_path), upstream_client=UpstreamClient(client=_UsagePoster()))
    _register_chat_service(app)
    client = TestClient(app)

    response = client.post(
        "/v1/chat/completions",
        headers={"X-Api-Key": "audit-key", "X-Request-Id": "req-test-success"},
        json={
            "model": "qwen-test",
            "messages": [{"role": "user", "content": "private prompt text"}],
        },
    )

    assert response.status_code == 200
    assert response.headers["x-request-id"] == "req-test-success"

    row = app.state.registry.get_request_audit("req-test-success")
    assert row is not None
    assert row["operation"] == "chat"
    assert row["requested_model"] == "qwen-test"
    assert row["resolved_service_id"] == "atlas-01/chat/qwen"
    assert row["upstream_model"] == "qwen-test"
    assert row["provider_kind"] == "openai"
    assert row["success"] is True
    assert row["status_code"] == 200
    assert row["prompt_tokens"] == 7
    assert row["completion_tokens"] == 3
    assert row["total_tokens"] == 10
    assert "private prompt text" not in json.dumps(row)


def test_failed_chat_route_is_audited(tmp_path: Path) -> None:
    app = create_app(_write_audit_config(tmp_path), upstream_client=UpstreamClient(client=_UsagePoster()))
    client = TestClient(app)

    response = client.post(
        "/v1/chat/completions",
        headers={"X-Api-Key": "audit-key", "X-Request-Id": "req-test-failure"},
        json={
            "model": "missing-model",
            "messages": [{"role": "user", "content": "private failure prompt"}],
        },
    )

    assert response.status_code == 404
    assert response.headers["x-request-id"] == "req-test-failure"

    row = app.state.registry.get_request_audit("req-test-failure")
    assert row is not None
    assert row["operation"] == "chat"
    assert row["requested_model"] == "missing-model"
    assert row["success"] is False
    assert row["status_code"] == 404
    assert row["error_type"] == "proxy_error"
    assert "private failure prompt" not in json.dumps(row)


def test_admin_audit_endpoints_list_and_summarize_requests(tmp_path: Path) -> None:
    app = create_app(_write_audit_config(tmp_path), upstream_client=UpstreamClient(client=_UsagePoster()))
    _register_chat_service(app)
    client = TestClient(app)
    client.post(
        "/v1/chat/completions",
        headers={"X-Api-Key": "audit-key"},
        json={"model": "qwen-test", "messages": [{"role": "user", "content": "hello"}]},
    )

    listed = client.get("/v1/admin/audit/requests", headers={"X-Api-Key": "audit-key"})
    assert listed.status_code == 200
    assert listed.json()["data"][0]["requested_model"] == "qwen-test"

    summary = client.get("/v1/admin/audit/summary", headers={"X-Api-Key": "audit-key"})
    assert summary.status_code == 200
    summary_row = summary.json()["data"][0]
    assert summary_row["requested_model"] == "qwen-test"
    assert summary_row["request_count"] == 1
    assert summary_row["success_count"] == 1
    assert summary_row["total_tokens"] == 10
@@ -1,221 +0,0 @@
from pathlib import Path

import pytest
from fastapi import Depends, Request
from fastapi.testclient import TestClient

from geniehive_control.auth import require_client_auth
from geniehive_control.keys import hash_api_key
from geniehive_control.main import create_app


def _write_config(tmp_path: Path, body: str) -> Path:
    config_path = tmp_path / "control.yaml"
    config_path.write_text(body)
    return config_path


def test_static_client_key_auth_still_works(tmp_path: Path) -> None:
    config_path = _write_config(
        tmp_path,
        f"""
auth:
  client_api_keys:
    - static-key
storage:
  sqlite_path: "{tmp_path / 'geniehive.sqlite3'}"
""",
    )
    app = create_app(config_path)
    client = TestClient(app)

    assert client.get("/v1/models").status_code == 401
    ok = client.get("/v1/models", headers={"X-Api-Key": "static-key"})
    assert ok.status_code == 200


def test_empty_static_keys_still_allow_development_access(tmp_path: Path) -> None:
    config_path = _write_config(
        tmp_path,
        f"""
storage:
  sqlite_path: "{tmp_path / 'geniehive.sqlite3'}"
""",
    )
    app = create_app(config_path)
    client = TestClient(app)

    response = client.get("/v1/models")
    assert response.status_code == 200


def test_named_client_key_auth_when_enabled(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
    monkeypatch.setenv("GENIEHIVE_KEY_HASH_SECRET", "test-secret")
    db_path = tmp_path / "geniehive.sqlite3"
    config_path = _write_config(
        tmp_path,
        f"""
auth:
  enable_named_client_keys: true
storage:
  sqlite_path: "{db_path}"
""",
    )
    app = create_app(config_path)
    raw_key = "gh_test_named"
    app.state.registry.create_client_key(
        key_id="ck_named",
        key_hash=hash_api_key(raw_key, secret="test-secret"),
        display_name="Named User",
        principal_type="person",
        principal_ref="named-user",
        role="developer",
        allowed_models=["archive_migrator"],
        allowed_operations=["chat"],
    )

    @app.get("/_test/client-context")
    async def client_context(request: Request, _=Depends(require_client_auth)) -> dict:
        context = request.state.client_context
        return {
            "auth_kind": context.auth_kind,
            "key_id": context.key_id,
            "principal_ref": context.principal_ref,
            "allowed_models": list(context.allowed_models),
            "allowed_operations": list(context.allowed_operations),
        }

    client = TestClient(app)

    missing = client.get("/_test/client-context")
    assert missing.status_code == 401

    bad = client.get("/_test/client-context", headers={"X-Api-Key": "wrong"})
    assert bad.status_code == 401

    ok = client.get("/_test/client-context", headers={"X-Api-Key": raw_key})
    assert ok.status_code == 200
    assert ok.json() == {
        "auth_kind": "named",
        "key_id": "ck_named",
        "principal_ref": "named-user",
        "allowed_models": ["archive_migrator"],
        "allowed_operations": ["chat"],
    }
    touched = app.state.registry.get_client_key("ck_named")
    assert touched is not None
    assert touched["last_used_at"] is not None


def test_disabled_named_client_key_fails(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
    monkeypatch.setenv("GENIEHIVE_KEY_HASH_SECRET", "test-secret")
    db_path = tmp_path / "geniehive.sqlite3"
    config_path = _write_config(
        tmp_path,
        f"""
auth:
  enable_named_client_keys: true
storage:
  sqlite_path: "{db_path}"
""",
    )
    app = create_app(config_path)
    raw_key = "gh_test_disabled"
    app.state.registry.create_client_key(
        key_id="ck_disabled",
        key_hash=hash_api_key(raw_key, secret="test-secret"),
        display_name="Disabled User",
        principal_type="person",
        principal_ref="disabled-user",
        enabled=False,
    )
    client = TestClient(app)

    response = client.get("/v1/models", headers={"X-Api-Key": raw_key})
    assert response.status_code == 401


def test_admin_client_key_endpoints_are_hidden_by_default() -> None:
    app = create_app()
    paths = {route.path for route in app.routes}

    assert "/v1/admin/client-keys" not in paths


def test_admin_can_create_list_disable_and_enable_named_keys(
    tmp_path: Path,
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    monkeypatch.setenv("GENIEHIVE_KEY_HASH_SECRET", "test-secret")
    db_path = tmp_path / "geniehive.sqlite3"
    config_path = _write_config(
        tmp_path,
        f"""
auth:
  client_api_keys:
    - admin-static-key
  enable_named_client_keys: true
admin_api:
  enabled: true
storage:
  sqlite_path: "{db_path}"
""",
    )
    app = create_app(config_path)
    client = TestClient(app)

    denied = client.get("/v1/admin/client-keys")
    assert denied.status_code == 401

    created = client.post(
        "/v1/admin/client-keys",
        headers={"X-Api-Key": "admin-static-key"},
        json={
            "key_id": "ck_created",
            "display_name": "Archive Migration",
            "principal_type": "person",
            "principal_ref": "wesley",
            "role": "developer",
            "allowed_models": ["archive_migrator"],
            "allowed_operations": ["chat"],
        },
    )
    assert created.status_code == 200
    created_body = created.json()
    assert created_body["api_key"].startswith("gh_")
    assert created_body["client_key"]["key_id"] == "ck_created"
    assert "key_hash" not in created_body["client_key"]

    listed = client.get(
        "/v1/admin/client-keys",
        headers={"X-Api-Key": "admin-static-key"},
    )
    assert listed.status_code == 200
    assert listed.json()["data"][0]["key_id"] == "ck_created"
    assert "key_hash" not in listed.json()["data"][0]

    disabled = client.post(
        "/v1/admin/client-keys/ck_created/disable",
        headers={"X-Api-Key": "admin-static-key"},
    )
    assert disabled.status_code == 200
    assert disabled.json()["client_key"]["enabled"] is False

    named_denied = client.get(
        "/v1/models",
        headers={"X-Api-Key": created_body["api_key"]},
    )
    assert named_denied.status_code == 401

    enabled = client.post(
        "/v1/admin/client-keys/ck_created/enable",
        headers={"X-Api-Key": "admin-static-key"},
    )
    assert enabled.status_code == 200
    assert enabled.json()["client_key"]["enabled"] is True

    named_ok = client.get(
        "/v1/models",
        headers={"X-Api-Key": created_body["api_key"]},
    )
    assert named_ok.status_code == 200
@@ -1,202 +0,0 @@
from pathlib import Path

import pytest
from fastapi.testclient import TestClient

from geniehive_control.keys import hash_api_key
from geniehive_control.main import create_app
from geniehive_control.models import HostRegistration, RegisteredService
from geniehive_control.upstream import UpstreamClient


class _FakeResponse:
    def __init__(self, payload: dict, status_code: int = 200) -> None:
        self._payload = payload
        self.status_code = status_code
        self.text = str(payload)

    def json(self) -> dict:
        return self._payload


class _FakePoster:
    async def post(self, url: str, *, json: dict, headers: dict[str, str] | None = None) -> _FakeResponse:
        if url.endswith("/v1/embeddings"):
            return _FakeResponse({"object": "list", "data": [{"embedding": [0.1, 0.2]}]})
        return _FakeResponse({"object": "chat.completion", "model": json["model"], "choices": []})


def _write_config(tmp_path: Path, *, static_key: bool = False) -> Path:
    config_path = tmp_path / "control.yaml"
    static_auth = """
  client_api_keys:
    - static-key
""" if static_key else ""
    config_path.write_text(
        f"""
auth:
{static_auth}  enable_named_client_keys: true
authorization:
  enforce_model_allowlists: true
  enforce_operation_allowlists: true
  empty_allowlist_means_no_access: true
storage:
  sqlite_path: "{tmp_path / 'geniehive.sqlite3'}"
"""
    )
    return config_path


def _register_services(app) -> None:
    app.state.registry.register_host(
        HostRegistration(
            host_id="atlas-01",
            address="127.0.0.1",
            services=[
                RegisteredService(
                    service_id="atlas-01/chat/qwen",
                    host_id="atlas-01",
                    kind="chat",
                    endpoint="http://127.0.0.1:18091",
                    assets=[{"asset_id": "archive_migrator", "loaded": True}],
                    state={"health": "healthy", "accept_requests": True},
                    observed={"p50_latency_ms": 100},
                ),
                RegisteredService(
                    service_id="atlas-01/embeddings/bge",
                    host_id="atlas-01",
                    kind="embeddings",
                    endpoint="http://127.0.0.1:18092",
                    assets=[{"asset_id": "bge-small", "loaded": True}],
                    state={"health": "healthy", "accept_requests": True},
                    observed={"p50_latency_ms": 100},
                ),
            ],
        )
    )


def _create_named_key(
    app,
    raw_key: str,
    *,
    allowed_models: list[str],
    allowed_operations: list[str],
) -> None:
    app.state.registry.create_client_key(
        key_id=f"ck_{raw_key}",
        key_hash=hash_api_key(raw_key, secret="test-secret"),
        display_name="Scoped User",
        principal_type="person",
        principal_ref="scoped-user",
        role="developer",
        allowed_models=allowed_models,
        allowed_operations=allowed_operations,
    )


def test_named_key_allows_scoped_chat_request(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
    monkeypatch.setenv("GENIEHIVE_KEY_HASH_SECRET", "test-secret")
    app = create_app(_write_config(tmp_path), upstream_client=UpstreamClient(client=_FakePoster()))
    _register_services(app)
    _create_named_key(
        app,
        "gh_allowed",
        allowed_models=["archive_migrator"],
        allowed_operations=["chat"],
    )
    client = TestClient(app)

    response = client.post(
        "/v1/chat/completions",
        headers={"X-Api-Key": "gh_allowed"},
        json={"model": "archive_migrator", "messages": [{"role": "user", "content": "hello"}]},
    )

    assert response.status_code == 200


def test_named_key_denies_unlisted_operation(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
    monkeypatch.setenv("GENIEHIVE_KEY_HASH_SECRET", "test-secret")
    app = create_app(_write_config(tmp_path), upstream_client=UpstreamClient(client=_FakePoster()))
    _register_services(app)
    _create_named_key(
        app,
        "gh_chat_only",
        allowed_models=["*"],
        allowed_operations=["chat"],
    )
    client = TestClient(app)

    response = client.post(
        "/v1/embeddings",
        headers={"X-Api-Key": "gh_chat_only"},
        json={"model": "bge-small", "input": "hello"},
    )

    assert response.status_code == 403
    assert response.json()["error"]["code"] == "authorization_error"


def test_named_key_denies_unlisted_model(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
    monkeypatch.setenv("GENIEHIVE_KEY_HASH_SECRET", "test-secret")
    app = create_app(_write_config(tmp_path), upstream_client=UpstreamClient(client=_FakePoster()))
    _register_services(app)
    _create_named_key(
        app,
        "gh_archive_only",
        allowed_models=["archive_migrator"],
        allowed_operations=["chat"],
    )
    client = TestClient(app)

    response = client.post(
        "/v1/chat/completions",
        headers={"X-Api-Key": "gh_archive_only"},
        json={"model": "other_role", "messages": [{"role": "user", "content": "hello"}]},
    )

    assert response.status_code == 403
    assert response.json()["error"]["code"] == "authorization_error"


def test_empty_allowlist_denies_when_configured(tmp_path: Path, monkeypatch: pytest.MonkeyPatch) -> None:
    monkeypatch.setenv("GENIEHIVE_KEY_HASH_SECRET", "test-secret")
    app = create_app(_write_config(tmp_path), upstream_client=UpstreamClient(client=_FakePoster()))
    _register_services(app)
    _create_named_key(
        app,
        "gh_empty",
        allowed_models=[],
        allowed_operations=[],
    )
    client = TestClient(app)

    response = client.post(
        "/v1/chat/completions",
        headers={"X-Api-Key": "gh_empty"},
        json={"model": "archive_migrator", "messages": [{"role": "user", "content": "hello"}]},
    )

    assert response.status_code == 403


def test_static_key_is_not_restricted_by_named_key_allowlists(
    tmp_path: Path,
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    monkeypatch.setenv("GENIEHIVE_KEY_HASH_SECRET", "test-secret")
    app = create_app(
        _write_config(tmp_path, static_key=True),
        upstream_client=UpstreamClient(client=_FakePoster()),
    )
    _register_services(app)
    client = TestClient(app)

    response = client.post(
        "/v1/embeddings",
        headers={"X-Api-Key": "static-key"},
        json={"model": "bge-small", "input": "hello"},
    )

    assert response.status_code == 200
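Reviewer note: the authorization tests above pin down three behaviours for named keys: a listed model or operation is allowed, `"*"` acts as a wildcard, and an empty allowlist denies access when `empty_allowlist_means_no_access` is set. The enforcement code itself is not part of this diff, so the sketch below is only one plausible way to express those rules. The `is_allowed` function and its `empty_means_no_access` parameter are hypothetical names, and the "empty list allows everything when the flag is off" branch is an assumption that the tests do not cover.

```python
def is_allowed(value: str, allowlist: list[str], *, empty_means_no_access: bool = True) -> bool:
    """Hypothetical allowlist check consistent with the tests in this diff."""
    if "*" in allowlist:
        # Wildcard entry: any model or operation is accepted.
        return True
    if not allowlist:
        # Assumption: with the flag off, an empty list means "unrestricted".
        return not empty_means_no_access
    return value in allowlist


# Cases mirrored from the test file (model and operation names are from the diff).
listed_model = is_allowed("archive_migrator", ["archive_migrator"])
unlisted_model = is_allowed("other_role", ["archive_migrator"])
wildcard_model = is_allowed("bge-small", ["*"])
unlisted_operation = is_allowed("embeddings", ["chat"])
empty_denied = is_allowed("archive_migrator", [])
```

A request would need to pass both the model check and the operation check, which is why `gh_chat_only` is rejected on `/v1/embeddings` even though its model allowlist is `["*"]`.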
@@ -1,45 +0,0 @@
from pathlib import Path

from geniehive_control.config import ControlConfig, load_config


def test_default_control_config_is_casual_and_non_governed() -> None:
    cfg = ControlConfig()

    assert cfg.deployment_profile == "casual"
    assert cfg.auth.client_api_keys == []
    assert cfg.auth.node_api_keys == []
    assert cfg.auth.enable_named_client_keys is False
    assert cfg.audit.enabled is False
    assert cfg.admin_api.enabled is False
    assert cfg.authorization.enforce_model_allowlists is False
    assert cfg.authorization.enforce_operation_allowlists is False
    assert cfg.providers == []
    assert cfg.budgeting.enabled is False


def test_legacy_control_example_loads_without_foundation_sections() -> None:
    cfg = load_config(Path("configs/control.example.yaml"))

    assert cfg.deployment_profile == "casual"
    assert cfg.auth.client_api_keys == ["change-me-client-key"]
    assert cfg.auth.node_api_keys == ["change-me-node-key"]
    assert cfg.auth.enable_named_client_keys is False
    assert cfg.audit.enabled is False
    assert cfg.admin_api.enabled is False
    assert cfg.providers == []


def test_foundation_control_example_loads_as_opt_in_profile() -> None:
    cfg = load_config(Path("configs/control.foundation.example.yaml"))

    assert cfg.deployment_profile == "foundation_gateway"
    assert cfg.auth.enable_named_client_keys is True
    assert cfg.audit.enabled is True
    assert cfg.admin_api.enabled is True
    assert cfg.authorization.enforce_model_allowlists is True
    assert cfg.authorization.enforce_operation_allowlists is True
    assert cfg.providers[0].provider_id == "openai-foundation"
    assert cfg.providers[0].api_key_env == "OPENAI_API_KEY"
    assert cfg.providers[1].provider_kind == "anthropic_messages"
    assert cfg.budgeting.global_monthly_budget_cents == 5000
@@ -1,60 +0,0 @@
from pathlib import Path

from geniehive_control.keys import generate_api_key, hash_api_key, redact_api_key, verify_api_key
from geniehive_control.registry import Registry


def test_api_key_hash_verify_and_redact() -> None:
    raw_key = generate_api_key(prefix="gh_test")
    key_hash = hash_api_key(raw_key, secret="test-secret")

    assert raw_key.startswith("gh_test_")
    assert key_hash.startswith("hmac-sha256:")
    assert verify_api_key(raw_key, key_hash, secret="test-secret") is True
    assert verify_api_key(raw_key + "-wrong", key_hash, secret="test-secret") is False
    assert verify_api_key(raw_key, key_hash, secret="other-secret") is False
    assert raw_key not in redact_api_key(raw_key)


def test_registry_client_key_lifecycle(tmp_path: Path) -> None:
    registry = Registry(tmp_path / "geniehive.sqlite3")
    raw_key = "gh_test_secret"
    key_hash = hash_api_key(raw_key, secret="test-secret")

    created = registry.create_client_key(
        key_id="ck_test",
        key_hash=key_hash,
        display_name="Test User",
        principal_type="person",
        principal_ref="test-user",
        role="developer",
        allowed_models=["archive_migrator"],
        allowed_operations=["chat"],
        monthly_budget_cents=1000,
        monthly_token_limit=20000,
        notes="created by test",
    )

    assert created["key_id"] == "ck_test"
    assert created["key_hash"] == key_hash
    assert created["display_name"] == "Test User"
    assert created["allowed_models"] == ["archive_migrator"]
    assert created["allowed_operations"] == ["chat"]
    assert created["enabled"] is True
    assert created["last_used_at"] is None

    listed = registry.list_client_keys()
    assert [item["key_id"] for item in listed] == ["ck_test"]

    by_hash = registry.get_client_key_by_hash(key_hash)
    assert by_hash is not None
    assert by_hash["principal_ref"] == "test-user"

    disabled = registry.set_client_key_enabled("ck_test", False)
    assert disabled is not None
    assert disabled["enabled"] is False

    registry.touch_client_key("ck_test")
    touched = registry.get_client_key("ck_test")
    assert touched is not None
    assert touched["last_used_at"] is not None
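Reviewer note: `test_api_key_hash_verify_and_redact` expects hashes prefixed with `hmac-sha256:`, verification that fails on a wrong key or a wrong secret, and a redacted form that never contains the full key. The `geniehive_control.keys` module itself is not shown in this diff, so the sketch below is an assumption about how such helpers could be written with the standard library. The `_sketch` function names, the token length, and the redaction format are all hypothetical.

```python
import hashlib
import hmac
import secrets

HASH_PREFIX = "hmac-sha256:"


def generate_api_key_sketch(prefix: str = "gh") -> str:
    """Return '<prefix>_<random token>' (hypothetical key format)."""
    return f"{prefix}_{secrets.token_urlsafe(24)}"


def hash_api_key_sketch(raw_key: str, *, secret: str) -> str:
    """Keyed hash of the raw key, so the database never stores the key itself."""
    digest = hmac.new(secret.encode("utf-8"), raw_key.encode("utf-8"), hashlib.sha256).hexdigest()
    return HASH_PREFIX + digest


def verify_api_key_sketch(raw_key: str, key_hash: str, *, secret: str) -> bool:
    """Recompute the hash and compare in constant time."""
    expected = hash_api_key_sketch(raw_key, secret=secret)
    return hmac.compare_digest(expected.encode("utf-8"), key_hash.encode("utf-8"))


def redact_api_key_sketch(raw_key: str) -> str:
    """Keep a short prefix for log correlation and hide the rest."""
    return raw_key[:6] + "..."


raw = generate_api_key_sketch(prefix="gh_test")
stored = hash_api_key_sketch(raw, secret="test-secret")
```

Using an HMAC with a server-side secret, rather than a bare SHA-256, means a leaked database alone is not enough to test guessed keys offline; `hmac.compare_digest` avoids leaking how many leading characters matched.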