# RoleMesh Gateway

RoleMesh Gateway is a lightweight **OpenAI-compatible** API gateway that routes chat-completions requests to multiple locally hosted LLM backends (e.g., `llama.cpp`'s `llama-server`) **by role** (planner, writer, coder, reviewer, …). It is designed for **agentic workflows** that benefit from using different models for different steps, and for deployments where **different machines host different models** (e.g., a GPU box for fast inference, a large-RAM CPU box for big models).

## What you get

- OpenAI-compatible endpoints:
  - `GET /v1/models`
  - `POST /v1/chat/completions` (streaming and non-streaming)
- `GET /health` and `GET /ready`
- Model registry loaded from `configs/models.yaml`
- Optional **node registration** so remote machines can announce role backends to the gateway
- Robust proxying with **explicit httpx timeouts** (no "hang forever")
- Structured logging with request IDs

## Quick start (proxy mode)

1. Create a venv and install:

   ```bash
   python -m venv .venv
   source .venv/bin/activate
   pip install -e .
   ```

2. Copy the example config:

   ```bash
   cp configs/models.example.yaml configs/models.yaml
   ```

3. Run the gateway:

   ```bash
   ROLE_MESH_CONFIG=configs/models.yaml uvicorn rolemesh_gateway.main:app --host 0.0.0.0 --port 8000
   ```

4. Smoke test:

   ```bash
   bash scripts/smoke_test.sh http://127.0.0.1:8000
   ```

## Multi-host (node registration)

If you want machines to host backends and "register" them dynamically, run a tiny node agent on each backend host (or call the registration endpoint from your own tooling).

- Gateway endpoint: `POST /v1/nodes/register`
- The node payload describes which **roles** the node serves and the base URL of its OpenAI-compatible backend.

See `docs/DEPLOYMENT.md` and `docs/CONFIG.md`.

## Status

This repository is a **preliminary scaffold**:

- Proxying to OpenAI-compatible upstreams works.
- Registration and load selection are implemented (basic round-robin), but persistence and auth are TODOs.

## License

MIT. See `LICENSE`.
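## Example: building a role-routed request

As a usage sketch, the snippet below builds an OpenAI-compatible chat-completions body for the gateway. It assumes the role name is carried in the `model` field (a common convention for role-based gateways); check `docs/CONFIG.md` for the actual routing key in your deployment. The helper name `build_chat_request` is illustrative, not part of this repo.

```python
import json

def build_chat_request(role: str, messages: list[dict], stream: bool = False) -> bytes:
    """Build an OpenAI-compatible chat-completions body.

    Assumption: the gateway routes on the `model` field, so we pass the
    role name (e.g. "coder") there instead of a concrete model id.
    """
    return json.dumps({
        "model": role,
        "messages": messages,
        "stream": stream,
    }).encode()

# Example body for the "coder" role; POST it to
# http://127.0.0.1:8000/v1/chat/completions with
# Content-Type: application/json (e.g. via curl -d @- or httpx).
body = build_chat_request("coder", [{"role": "user", "content": "Write FizzBuzz."}])
```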
## Node Agent (per-host)

This repo also includes a **RoleMesh Node Agent** (`rolemesh-node-agent`) that can manage **persistent** `llama.cpp` servers (one per GPU) and report inventory/metrics back to the gateway.

- Sample config: `configs/node_agent.example.yaml`
- Docs: `docs/NODE_AGENT.md`

## Safe-by-default binding

The gateway and node agent default to binding `127.0.0.1` to avoid accidental exposure. If you need remote access, bind only to private/LAN or VPN interfaces and firewall the ports.
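## Example: node registration payload

For reference, here is a sketch of a payload a node (or your own tooling) might send to `POST /v1/nodes/register`. The field names (`node_id`, `base_url`, `roles`) are assumptions for illustration; the gateway's actual schema lives in the source and `docs/DEPLOYMENT.md`.

```python
import json

# Hypothetical registration body: field names are illustrative, not
# guaranteed to match the gateway's real schema.
registration = {
    "node_id": "gpu-box-1",                # identifier chosen by the node
    "base_url": "http://10.0.0.12:8080",   # OpenAI-compatible backend on this host
    "roles": ["coder", "reviewer"],        # roles this node serves
}
payload = json.dumps(registration)
```

With a body like this, the gateway can add the node's backend to its per-role pool and include it in round-robin selection.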