# Architecture
RoleMesh Gateway sits between OpenAI-compatible clients and one or more upstream backends.
## Goals
- Present a stable OpenAI-like API surface to tools (agents, IDEs, scripts)
- Route by **role**, not by a specific model binary
- Support both:
  - single-host (gateway + backends on the same machine)
  - multi-host (different machines serve different roles/models)
## Request flow
1. Client sends `POST /v1/chat/completions` with `model: "<role>"`
2. Gateway resolves the role via:
   - `type: proxy` → fixed `proxy_url`, or
   - `type: discovered` → pick from registered nodes serving that role
3. Gateway forwards the request (and streams responses if requested)
4. Gateway returns an OpenAI-like response
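Step 2 (role resolution) can be sketched as a pure function. This is a minimal illustration, not the gateway's actual code; names like `resolve_role`, `roles_config`, and the record shapes are assumptions:

```python
def resolve_role(role, roles_config, registry, pick):
    """Return the upstream base URL for a role.

    roles_config: {role: {"type": "proxy"|"discovered", ...}}  (illustrative shape)
    registry:     list of registered node records with "roles" and "base_url"
    pick:         selection strategy, e.g. round-robin over matching nodes
    """
    cfg = roles_config[role]
    if cfg["type"] == "proxy":
        # Fixed upstream: always forward to the configured URL.
        return cfg["proxy_url"]
    if cfg["type"] == "discovered":
        # Pick among registered nodes that announced this role.
        nodes = [n for n in registry if role in n["roles"]]
        if not nodes:
            raise LookupError(f"no registered node serves role {role!r}")
        return pick(nodes)["base_url"]
    raise ValueError(f"unknown role type {cfg['type']!r}")
```

For a `proxy` role the registry and strategy are never consulted; only `discovered` roles depend on registration.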
## Registration model
Registration is optional and minimal:
- A node announces `(node_id, base_url, roles, meta)`
- The gateway stores this in memory (and optionally in `state/registry.json`)
- For `type: discovered`, the gateway picks a node by selection strategy
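The registry described above fits in a few lines. A minimal sketch, assuming an in-memory dict keyed by `node_id` with optional JSON persistence (the `Registry` class and its method names are illustrative, not the gateway's API):

```python
import json
import pathlib


class Registry:
    """In-memory node registry with optional JSON persistence (sketch)."""

    def __init__(self, path=None):
        self.nodes = {}   # node_id -> announced record
        self.path = path  # e.g. "state/registry.json", or None for memory-only

    def register(self, node_id, base_url, roles, meta=None):
        # A node announces (node_id, base_url, roles, meta); re-registering
        # the same node_id simply overwrites the previous record.
        self.nodes[node_id] = {
            "node_id": node_id,
            "base_url": base_url,
            "roles": list(roles),
            "meta": meta or {},
        }
        if self.path:
            pathlib.Path(self.path).write_text(json.dumps(self.nodes))

    def nodes_for(self, role):
        # Candidates for a `type: discovered` role.
        return [n for n in self.nodes.values() if role in n["roles"]]
```

Because the interface is just `register` and `nodes_for`, the backing store can later be replaced without touching the routing logic.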
This is deliberately small so you can swap it out later for something stronger:
- Consul, etcd, Redis
- mDNS discovery on LAN
- static inventory in Ansible/systemd units
## Known limitations (scaffold)
- No auth on registration or inference endpoints
- No TTL/health polling
- Round-robin selection only
These are tracked in `docs/DEPLOYMENT.md` as next steps.
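For reference, the round-robin strategy mentioned above amounts to a per-role counter. A hypothetical sketch (the `RoundRobin` class is illustrative):

```python
class RoundRobin:
    """Cycle through the nodes serving a role, one counter per role (sketch)."""

    def __init__(self):
        self.counters = {}  # role -> number of picks so far

    def pick(self, role, nodes):
        # nodes: non-empty list of candidates for this role.
        i = self.counters.get(role, 0)
        self.counters[role] = i + 1
        return nodes[i % len(nodes)]
```

Note that the counter ignores membership changes: if a node registers or drops out between picks, the modulo simply wraps over the new list, which is acceptable for a scaffold without health polling.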