# Architecture
RoleMesh Gateway sits between OpenAI-compatible clients and one or more upstream backends.
## Goals
- Present a stable OpenAI-like API surface to tools (agents, IDEs, scripts)
- Route by **role**, not by a specific model binary
- Support both:
  - single-host (gateway + backends on the same machine)
  - multi-host (different machines serve different roles/models)
## Request flow
1. Client sends `POST /v1/chat/completions` with `model: "<role>"`
2. Gateway resolves the role via:
   - `type: proxy` → fixed `proxy_url`, or
   - `type: discovered` → pick from registered nodes serving that role
3. Gateway forwards the request (and streams responses if requested)
4. Gateway returns an OpenAI-like response
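Step 2 (role resolution) can be sketched as a pure function. This is a minimal illustration, not the gateway's actual code; names like `resolve_role`, `roles_config`, and the record shapes are assumptions:

```python
def resolve_role(role, roles_config, registry, pick):
    """Return the upstream base URL for a role.

    roles_config: {role: {"type": "proxy"|"discovered", ...}}  (illustrative shape)
    registry:     list of registered node records with "roles" and "base_url"
    pick:         selection strategy, e.g. round-robin over matching nodes
    """
    cfg = roles_config[role]
    if cfg["type"] == "proxy":
        # Fixed upstream: always forward to the configured URL.
        return cfg["proxy_url"]
    if cfg["type"] == "discovered":
        # Pick among registered nodes that announced this role.
        nodes = [n for n in registry if role in n["roles"]]
        if not nodes:
            raise LookupError(f"no registered node serves role {role!r}")
        return pick(nodes)["base_url"]
    raise ValueError(f"unknown role type {cfg['type']!r}")
```

For a `proxy` role the registry and strategy are never consulted; only `discovered` roles depend on registration.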
## Registration model
Registration is optional and minimal:
- A node announces `(node_id, base_url, roles, meta)`
- The gateway stores this in memory (and optionally in `state/registry.json`)
- For `type: discovered`, the gateway picks a node by selection strategy
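The registry described above fits in a few lines. A minimal sketch, assuming an in-memory dict keyed by `node_id` with optional JSON persistence (the `Registry` class and its method names are illustrative, not the gateway's API):

```python
import json
import pathlib


class Registry:
    """In-memory node registry with optional JSON persistence (sketch)."""

    def __init__(self, path=None):
        self.nodes = {}   # node_id -> announced record
        self.path = path  # e.g. "state/registry.json", or None for memory-only

    def register(self, node_id, base_url, roles, meta=None):
        # A node announces (node_id, base_url, roles, meta); re-registering
        # the same node_id simply overwrites the previous record.
        self.nodes[node_id] = {
            "node_id": node_id,
            "base_url": base_url,
            "roles": list(roles),
            "meta": meta or {},
        }
        if self.path:
            pathlib.Path(self.path).write_text(json.dumps(self.nodes))

    def nodes_for(self, role):
        # Candidates for a `type: discovered` role.
        return [n for n in self.nodes.values() if role in n["roles"]]
```

Because the interface is just `register` and `nodes_for`, the backing store can later be replaced without touching the routing logic.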
This is deliberately small so you can swap it out later for something stronger:
- Consul, etcd, Redis
- mDNS discovery on LAN
- static inventory in Ansible/systemd units
## Known limitations (scaffold)
- No auth on registration or inference endpoints
- No TTL/health polling
- Round-robin selection only
These are tracked in `docs/DEPLOYMENT.md` as next steps.
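For reference, the round-robin strategy mentioned above amounts to a per-role counter. A hypothetical sketch (the `RoundRobin` class is illustrative):

```python
class RoundRobin:
    """Cycle through the nodes serving a role, one counter per role (sketch)."""

    def __init__(self):
        self.counters = {}  # role -> number of picks so far

    def pick(self, role, nodes):
        # nodes: non-empty list of candidates for this role.
        i = self.counters.get(role, 0)
        self.counters[role] = i + 1
        return nodes[i % len(nodes)]
```

Note that the counter ignores membership changes: if a node registers or drops out between picks, the modulo simply wraps over the new list, which is acceptable for a scaffold without health polling.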