# Architecture
RoleMesh Gateway sits between OpenAI-compatible clients and one or more upstream backends.
## Goals
- Present a stable OpenAI-like API surface to tools (agents, IDEs, scripts)
- Route by role, not by a specific model binary
- Support both:
  - single-host (gateway + backends on the same machine)
  - multi-host (different machines serve different roles/models)
## Request flow
- Client sends `POST /v1/chat/completions` with `model: "<role>"`
- Gateway resolves the role via:
  - `type: proxy` → fixed `proxy_url`, or
  - `type: discovered` → pick from registered nodes serving that role
- Gateway forwards the request (and streams responses if requested)
- Gateway returns an OpenAI-like response
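The resolution step above can be sketched as follows. The role names, config shape, and round-robin helper are illustrative assumptions, not the gateway's actual API:

```python
# Sketch of role resolution: map the request's "model" field (a role)
# to an upstream base URL. Config and registry shapes are assumptions.
import itertools

ROLES = {
    # type: proxy -> always forward to a fixed upstream URL
    "coder": {"type": "proxy", "proxy_url": "http://127.0.0.1:8081"},
    # type: discovered -> pick among nodes registered for this role
    "chat": {"type": "discovered"},
}

# registry of discovered nodes: role -> base URLs announced by nodes
REGISTRY = {"chat": ["http://10.0.0.5:8000", "http://10.0.0.6:8000"]}
_rr = {role: itertools.cycle(urls) for role, urls in REGISTRY.items()}

def resolve(model: str) -> str:
    """Return the upstream base URL for a role."""
    role = ROLES[model]
    if role["type"] == "proxy":
        return role["proxy_url"]
    return next(_rr[model])  # round-robin over registered nodes
```

After resolving, the gateway would forward the original request body to `<base_url>/v1/chat/completions` and relay the (possibly streamed) response.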
## Registration model
Registration is optional and minimal:
- A node announces `(node_id, base_url, roles, meta)`
- The gateway stores this in memory (and optionally in `state/registry.json`)
- For `type: discovered`, the gateway picks a node by selection strategy
This is deliberately small so you can swap it out later for something stronger:
- Consul, etcd, Redis
- mDNS discovery on LAN
- static inventory in Ansible/systemd units
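A minimal sketch of this registration model, assuming the field names follow the announced tuple (the persistence format of `state/registry.json` is also an assumption):

```python
# In-memory registry: a node announces (node_id, base_url, roles, meta)
# and the gateway keeps the record in a dict keyed by node_id.
import json

registry: dict[str, dict] = {}

def register(node_id: str, base_url: str, roles: list[str], meta: dict) -> None:
    """Store or overwrite a node's announcement."""
    registry[node_id] = {"base_url": base_url, "roles": roles, "meta": meta}

def nodes_for(role: str) -> list[dict]:
    """All registered nodes that serve a given role."""
    return [n for n in registry.values() if role in n["roles"]]

register("gpu-a", "http://10.0.0.5:8000", ["chat"], {"gpu": "A100"})
register("gpu-b", "http://10.0.0.6:8000", ["chat", "embed"], {})

# Optional persistence, e.g. dumped to state/registry.json:
snapshot = json.dumps(registry, indent=2)
```

Because the store is a plain dict behind two functions, swapping in Consul, etcd, or Redis only means reimplementing `register` and `nodes_for` against that backend.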
## Known limitations (scaffold)
- No auth on registration or inference endpoints
- No TTL/health polling
- Round-robin selection only
These are tracked in `docs/DEPLOYMENT.md` as next steps.