# Architecture

RoleMesh Gateway sits between OpenAI-compatible clients and one or more upstream backends.

## Goals

- Present a stable OpenAI-like API surface to tools (agents, IDEs, scripts)
- Route by **role**, not by a specific model binary
- Support both:
  - single-host (gateway + backends on the same machine)
  - multi-host (different machines serve different roles/models)

## Request flow

1. Client sends `POST /v1/chat/completions` with `model: "<role>"`
2. Gateway resolves the role via:
   - `type: proxy` → a fixed `proxy_url`, or
   - `type: discovered` → pick from registered nodes serving that role
3. Gateway forwards the request (and streams the response if requested)
4. Gateway returns an OpenAI-like response

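The flow above can be exercised from any OpenAI-compatible client. A minimal sketch, assuming the gateway listens on `http://localhost:8080` and a role named `coder` exists (both the address and the role name are hypothetical, not defined in this doc):

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8080"  # hypothetical gateway address


def build_payload(role: str, messages: list, stream: bool = False) -> dict:
    # The role name goes where an OpenAI client would normally put a model name.
    return {"model": role, "messages": messages, "stream": stream}


def chat(role: str, messages: list) -> dict:
    # POST the payload to the gateway's OpenAI-compatible endpoint.
    req = urllib.request.Request(
        f"{GATEWAY_URL}/v1/chat/completions",
        data=json.dumps(build_payload(role, messages)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Because the gateway keeps the OpenAI surface, existing SDKs and tools should also work by pointing their base URL at the gateway and using the role name as the model.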
## Registration model

Registration is optional and minimal:

- A node announces `(node_id, base_url, roles, meta)`
- The gateway stores this in memory (and optionally in `state/registry.json`)
- For `type: discovered`, the gateway picks a node by selection strategy

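As a sketch of that registration model (the class and method names are illustrative, not the gateway's actual API), an in-memory registry indexed by `node_id` and queried by role might look like:

```python
class Registry:
    """Minimal in-memory node registry; field names mirror the
    announcement tuple (node_id, base_url, roles, meta) above."""

    def __init__(self):
        self._nodes = {}  # node_id -> announcement dict

    def register(self, node_id: str, base_url: str, roles: list, meta: dict):
        # Re-announcing with the same node_id simply overwrites the entry.
        self._nodes[node_id] = {
            "node_id": node_id,
            "base_url": base_url,
            "roles": list(roles),
            "meta": dict(meta),
        }

    def nodes_for_role(self, role: str) -> list:
        # Candidates for resolving a `type: discovered` role.
        return [n for n in self._nodes.values() if role in n["roles"]]
```

Persisting `self._nodes` to `state/registry.json` is then a straightforward serialization step.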
This is deliberately small so you can swap it out later for something stronger:

- Consul, etcd, Redis
- mDNS discovery on the LAN
- a static inventory in Ansible/systemd units

## Known limitations (scaffold)

- Auth is optional and config-driven rather than enforced by default
- No TTL or health polling of registered nodes
- No automatic config reload
- Selection strategies are intentionally simple and limited to `round_robin` and `random`

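The two built-in strategies are simple enough to sketch in a few lines; this illustrates the behavior, not the gateway's implementation:

```python
import itertools
import random


def round_robin(nodes: list):
    # Cycle through candidate nodes in registration order.
    cycle = itertools.cycle(nodes)
    return lambda: next(cycle)


def pick_random(nodes: list, rng=random):
    # Uniform choice among candidates on every request.
    return lambda: rng.choice(nodes)
```

Neither strategy accounts for load or health, which is why TTL/health polling appears in the limitations list above.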
These are tracked in `docs/DEPLOYMENT.md` as next steps.